Preface

Tables on CAD drawings carry a large amount of critical data, and during production that data usually has to be exported to Excel. This article shows how a custom MxCAD plugin can recognize tables in CAD drawings on the web, detect merged cells automatically, and export the result to Excel, which improves the efficiency and accuracy of data extraction.

1. Function Overview

The main functions of the table extraction feature are as follows:

1. **Interactive Area Selection**: Users can select the target table area by clicking and dragging with the mouse.
2. **Graphic Element Recognition**: Automatically identifies lines, text, polylines, etc. within the selected range.
3. **Table Structure Reconstruction**: Reconstructs the table grid based on intersection analysis.
4. **Intelligent Cell Merging**: Supports both horizontal and vertical cell merging recognition.
5. **Content Extraction and Export**: Extracts the text content of cells and exports it as an Excel file.

2. Technical Implementation Principles

2.1 Entity Acquisition and Preprocessing

First, the user is asked to specify an extraction range (a rectangular box). The `MxCADSelectionSet` selection set in mxcad is then used to select all entities that cross this area:

- const ss = new MxCADSelectionSet();
- await ss.crossingSelect(corner1.x, corner1.y, corner2.x, corner2.y);

To ensure that entities inside nested blocks (BlockReference) are also recognized, the program recursively traverses each block definition and applies the transformation matrix (`blockTransform`) to restore their true coordinate positions.

- const needTransformEntity: { handle: string, mart: McGeMatrix3d }[] = [];
- const Mx_getBlokEntity = (blkRec: McDbBlockTableRecord, mart: McGeMatrix3d) => {
- blkRec.getAllEntityId().forEach(id => {
- let ent = id.getMcDbEntity();
- if (ent instanceof McDbBlockReference) {
- let blkref = ent as McDbBlockReference;
- let mat = blkref.blockTransform.clone();
- mat.preMultBy(mart);
- Mx_getBlokEntity(blkref.blockTableRecordId.getMcDbBlockTableRecord(), mat);
- } else {
- needTransformEntity.push({ handle: ent.getHandle(), mart });
- ...
- }
- })
- }
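
The recursion above accumulates one matrix per nesting level. As a rough illustration of why the order of multiplication matters, here is a minimal sketch that does not use mxcad at all: `Affine2D`, `composeAffine`, `applyAffine`, `makeTranslation`, and `makeScale` are hypothetical helpers introduced only for this example, and the 2D matrices are a simplification of `McGeMatrix3d`.

```typescript
// Hypothetical 2D affine transform [a c e; b d f; 0 0 1]; not an mxcad type.
interface Affine2D { a: number; b: number; c: number; d: number; e: number; f: number; }
interface Pt { x: number; y: number; }

// Returns the transform equivalent to applying `inner` first, then `outer`.
function composeAffine(outer: Affine2D, inner: Affine2D): Affine2D {
  return {
    a: outer.a * inner.a + outer.c * inner.b,
    b: outer.b * inner.a + outer.d * inner.b,
    c: outer.a * inner.c + outer.c * inner.d,
    d: outer.b * inner.c + outer.d * inner.d,
    e: outer.a * inner.e + outer.c * inner.f + outer.e,
    f: outer.b * inner.e + outer.d * inner.f + outer.f,
  };
}

function applyAffine(m: Affine2D, p: Pt): Pt {
  return { x: m.a * p.x + m.c * p.y + m.e, y: m.b * p.x + m.d * p.y + m.f };
}

const makeTranslation = (dx: number, dy: number): Affine2D =>
  ({ a: 1, b: 0, c: 0, d: 1, e: dx, f: dy });
const makeScale = (s: number): Affine2D =>
  ({ a: s, b: 0, c: 0, d: s, e: 0, f: 0 });

// Outer block reference: moved by (10, 0). Inner (nested) block reference: scaled by 2.
const outerRef = makeTranslation(10, 0);
const innerRef = makeScale(2);

// A point defined inside the inner block is scaled first, then moved.
const worldPt = applyAffine(composeAffine(outerRef, innerRef), { x: 1, y: 1 });

// Composing in the opposite order scales the translation as well.
const wrongOrderPt = applyAffine(composeAffine(innerRef, outerRef), { x: 1, y: 1 });
```

With these inputs the correct order gives (12, 2), while the reversed order gives (22, 2), so an entity in a nested block would land in the wrong cell if the matrices were combined the wrong way round.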

In addition, polylines are broken down into independent straight line segments or arc segments to facilitate subsequent intersection calculations.

- const explodePl = (ent: McDbPolyline, mart?: McGeMatrix3d): McDbEntity[] => {
- // If it is a polyline, it needs to be broken down into line segments.
- const numVert = ent.numVerts();
- const entsArr: McDbEntity[] = [];
- for (let i = 0; i < numVert; i++) {
- if (i < numVert - 1) {
- const convexity = ent.getBulgeAt(i);
- const pt1 = ent.getPointAt(i).val;
- const pt2 = ent.getPointAt(i + 1).val;
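- if (!convexity) {
- // Straight segment: transform the end points (if needed) and build a line.
- if (mart) {
- pt1.transformBy(mart);
- pt2.transformBy(mart);
- }
- const line = new McDbLine(pt1, pt2);
- entsArr.push(line)
- } else {
- // Arc segment: sample the midpoint in the polyline's own coordinates first,
- // because getDistAtPoint expects points that lie on the untransformed polyline.
- const d = (ent.getDistAtPoint(pt1).val + ent.getDistAtPoint(pt2).val) / 2;
- const midPt = ent.getPointAtDist(d).val;
- if (mart) {
- pt1.transformBy(mart);
- pt2.transformBy(mart);
- midPt.transformBy(mart);
- }
- const arc = new McDbArc();
- arc.computeArc(pt1.x, pt1.y, midPt.x, midPt.y, pt2.x, pt2.y);
- entsArr.push(arc)
- }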
- } else {
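- if (ent.isClosed) {
- // Closing segment from the last vertex back to the first; apply the block transform here as well.
- const lastPt = ent.getPointAt(numVert - 1).val;
- const firstPt = ent.getPointAt(0).val;
- if (mart) {
- lastPt.transformBy(mart);
- firstPt.transformBy(mart);
- }
- entsArr.push(new McDbLine(lastPt, firstPt))
- }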
- }
- }
- return entsArr;
- }
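
The code above finds a point on each arc segment by measuring distance along the polyline. As a hedged alternative sketch, the midpoint can also be derived directly from the bulge value, assuming `getBulgeAt` follows the usual DXF convention (bulge = tan of a quarter of the included angle, positive for counter-clockwise arcs). That assumption is not confirmed by the article, and `Pt2` and `arcMidpointFromBulge` are names invented for this example.

```typescript
interface Pt2 { x: number; y: number; }

// Midpoint of the arc between p1 and p2 for a DXF-style bulge (bulge = tan(sweep / 4)).
// A positive bulge means the arc runs counter-clockwise from p1 to p2.
function arcMidpointFromBulge(p1: Pt2, p2: Pt2, bulge: number): Pt2 {
  const dx = p2.x - p1.x;
  const dy = p2.y - p1.y;
  // The sagitta equals bulge * (chord / 2) and is measured from the chord midpoint
  // along the right-hand normal (dy, -dx) / chord, so the chord length cancels out.
  return {
    x: (p1.x + p2.x) / 2 + (bulge / 2) * dy,
    y: (p1.y + p2.y) / 2 - (bulge / 2) * dx,
  };
}

// Semicircle of radius 1 around the origin, travelling counter-clockwise from (0, -1) to (0, 1).
const semicircleMid = arcMidpointFromBulge({ x: 0, y: -1 }, { x: 0, y: 1 }, 1);

// A bulge of 0 degenerates to the midpoint of the straight chord.
const straightMid = arcMidpointFromBulge({ x: 0, y: 0 }, { x: 4, y: 0 }, 0);
```

For the semicircle the result is (1, 0), the rightmost point of the unit circle. The three points (start, midpoint, end) are what a three-point arc constructor such as the `computeArc` call above needs.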

2.2 Classification of Table Line Segments

In the previous steps, we extracted all entities within the selected area of the drawing and performed preliminary processing on some of them. Next, we extract all straight lines within the selected area and classify them into two categories:

- **Horizontal line**: Direction is close to the X-axis
- **Vertical line**: Direction is close to the Y-axis

A line is classified by the angle between its direction vector and the unit vectors of the X-axis and Y-axis.

- const horizontalLineArr: McDbLine[] = []; // Horizontal
- const verticalLineArr: McDbLine[] = []; // Vertical
- lineArr.forEach(item => {
- const vec_x = McGeVector3d.kXAxis;
- const vec_y = McGeVector3d.kYAxis;
- const line = item.clone() as McDbLine;
- // Determine if the line is an entity within a block. If so, the transformation matrix needs to be used to restore the true coordinate position.
- const res = needTransformEntity.find(i => i.handle === item.getHandle());
- if (res) {
- line.startPoint = line.startPoint.clone().transformBy(res.mart);
- line.endPoint = line.endPoint.clone().transformBy(res.mart);
- }
- const _vec = line.startPoint.sub(line.endPoint).normalize().mult(precision);
- if (vec_x.angleTo1(_vec) < precision || Math.abs((vec_x.angleTo1(_vec) - Math.PI)) < precision) {
- horizontalLineArr.push(new McDbLine(line.startPoint.addvec(_vec), line.endPoint.subvec(_vec)))
- }
- if (vec_y.angleTo1(_vec) < precision || Math.abs((vec_y.angleTo1(_vec) - Math.PI)) < precision) {
- verticalLineArr.push(new McDbLine(line.startPoint.addvec(_vec), line.endPoint.subvec(_vec)))
- };
- });
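
The angle test can be illustrated without mxcad. The following is a minimal sketch on plain coordinates; `Seg`, `LineKind`, `classifySegment`, and the default tolerance are assumptions made for this example and are not taken from the plugin.

```typescript
interface Seg { x1: number; y1: number; x2: number; y2: number; }
type LineKind = 'horizontal' | 'vertical' | 'other';

// Classify a segment by the angle between its direction and the X axis.
function classifySegment(seg: Seg, angleTol: number = 1e-3): LineKind {
  const dx = seg.x2 - seg.x1;
  const dy = seg.y2 - seg.y1;
  if (dx === 0 && dy === 0) return 'other'; // degenerate segment, no direction
  // Folding with Math.abs maps every direction into [0, PI/2],
  // so a line drawn right-to-left is treated like one drawn left-to-right.
  const angle = Math.atan2(Math.abs(dy), Math.abs(dx));
  if (angle < angleTol) return 'horizontal';
  if (Math.PI / 2 - angle < angleTol) return 'vertical';
  return 'other';
}

const nearlyFlat = classifySegment({ x1: 0, y1: 0, x2: 10, y2: 0.001 });
const downward = classifySegment({ x1: 0, y1: 0, x2: 0, y2: -5 });
const diagonal = classifySegment({ x1: 0, y1: 0, x2: 1, y2: 1 });
```

Folding the direction into the first quadrant plays the same role as the two comparisons against 0 and `Math.PI` in the code above: both accept a line regardless of which end point comes first.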

2.3 Extraction and De-duplication of Table Intersection Points

In the previous step, we obtained all horizontal and vertical lines. Next, the intersection points between them are used to construct the table node matrix. Each intersection coordinate is rounded to the configured `precision` (so that nearly identical coordinates compare equal), and the points are de-duplicated to form a unique set of grid points.

- // Remove duplicates from the point array
- const deduplicatePoints = (points: McGePoint3d[]): McGePoint3d[]=> {
- const allPoints: McGePoint3d[] = [];
- points.forEach((item, index) => {
- const res = points.filter((j, ind) => {
- return ind > index && item.distanceTo(j) < 0.00001
- });
- if (!res.length) allPoints.push(item)
- });
- return allPoints;
- }
- // Round a number to the number of decimal places implied by `precision`
- const roundToPrecision = (num: number, precision = 0.0001): number => {
- const decimals = Math.abs(Math.floor(Math.log10(precision))); // Number of decimal places corresponding to the precision
- const factor = Math.pow(10, decimals);
- return Math.round(num * factor) / factor;
- }
- let allPoints: McGePoint3d[] = [];
- horizontalLineArr.forEach(line1 => {
- verticalLineArr.forEach(line2 => {
- const res = line1.IntersectWith(line2, McDb.Intersect.kOnBothOperands);
- if (res.length()) res.forEach(pt => {
- pt.x = roundToPrecision(pt.x, precision);
- pt.y = roundToPrecision(pt.y, precision);
- if (arePointsInRectangle([new_corner1, new McGePoint3d(new_corner1.x, new_corner2.y), new_corner2, new McGePoint3d(new_corner2.x, new_corner1.y)], [pt])) {
- allPoints.push(pt)
- }
- })
- })
- });
- allPoints = deduplicatePoints(allPoints);//Remove duplicates from the point array;
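
Rounding and de-duplication can also be combined in a single pass. The sketch below is one possible variant on plain `{x, y}` objects, not the plugin's implementation; `GridPt`, `roundTo`, and `snapAndDedupe` are hypothetical names, and the number of decimals is passed directly instead of being derived from a precision value.

```typescript
interface GridPt { x: number; y: number; }

// Round to a fixed number of decimal places.
function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Snap every point to the rounding grid and keep one point per grid position.
function snapAndDedupe(points: GridPt[], decimals: number = 3): GridPt[] {
  const seen = new Map<string, GridPt>();
  points.forEach(p => {
    const q: GridPt = { x: roundTo(p.x, decimals), y: roundTo(p.y, decimals) };
    const key = `${q.x},${q.y}`;
    if (!seen.has(key)) seen.set(key, q);
  });
  const unique: GridPt[] = [];
  seen.forEach(q => unique.push(q));
  return unique;
}

const uniquePts = snapAndDedupe([
  { x: 10.00004, y: 5 },
  { x: 9.99998, y: 5.00001 }, // same node as the first point after snapping
  { x: 20, y: 5 },
]);
```

Keying a `Map` by the snapped coordinates avoids the pairwise distance comparison, at the cost that two points which straddle a rounding boundary can still end up as separate nodes. Whether that matters depends on how clean the drawing is.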

2.4 Construction of the Initial Cell Matrix

Sort the intersection points by their X and Y coordinates and generate a two-dimensional grid structure `cellPointsArr`, where each element is either an intersection point or `null` (indicating a missing corner point). For example:

- [
- [A1, B1, null, D1],
- [A2, B2, C2, D2],
- [null, B3, C3, D3]
- ]

The matrix is built as follows:

- const _x = Array.from(new Set(allPoints.map(item => item.x))).sort((a, b) => a - b);
- const _y = Array.from(new Set(allPoints.map(item => item.y))).sort((a, b) => b - a);
- const cellPointsArr: (McGePoint3d | null)[][] = [];
- _y.forEach((y, row) => {
- const arr: (McGePoint3d | null)[] = [];
- const pts = allPoints.filter(item => item.y === y);
- if (pts.length) {
- _x.forEach((x, col) => {
- const index = pts.findIndex(item => item.x === x);
- // If the four corners of the table are missing, manually fill in the data to make the table complete.
- if (index === -1) {
- if ((row === 0 || row === _y.length - 1) && (col === 0 || col === _x.length - 1)) {
- arr.push(new McGePoint3d(x, y));
- } else {
- arr.push(null)
- }
- } else {
- arr.push(pts[index])
- }
- });
- cellPointsArr.push(arr)
- } else {
- cellPointsArr.push(null);
- }
- });
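
To make the resulting structure concrete, here is a small self-contained sketch that builds the same kind of grid from plain coordinates. `NodePt` and `buildGrid` are names chosen for this example, and the corner-filling step from the code above is left out on purpose.

```typescript
interface NodePt { x: number; y: number; }

// Build a row-major grid of nodes: rows ordered top to bottom, columns left to right.
// A position without an intersection point becomes null.
function buildGrid(points: NodePt[]): (NodePt | null)[][] {
  const xs = Array.from(new Set(points.map(p => p.x))).sort((a, b) => a - b);
  const ys = Array.from(new Set(points.map(p => p.y))).sort((a, b) => b - a);
  const lookup = new Set(points.map(p => `${p.x},${p.y}`));
  return ys.map(y =>
    xs.map(x => (lookup.has(`${x},${y}`) ? { x, y } : null))
  );
}

// Two rows of nodes; the bottom row has no node in the middle column.
const demoNodes: NodePt[] = [
  { x: 0, y: 10 }, { x: 5, y: 10 }, { x: 10, y: 10 },
  { x: 0, y: 0 }, { x: 10, y: 0 },
];
const demoNodeGrid = buildGrid(demoNodes);
```

Here `demoNodeGrid` has two rows and three columns, and the middle entry of the second row is `null`. Exact equality on coordinates only works because the points were rounded beforehand, which is the reason for the rounding step in section 2.3.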

3. Intelligent Cell Merging Mechanism

3.1 Overview of Merging Strategy

Next, we adopt a two-stage merging strategy:

1. **Horizontal merging takes precedence**
2. **Vertical merging supplements**

Vertical merging is triggered only when the **2×2 sub-matrix formed after horizontal merging still contains `null` elements**.

3.2 Horizontal Merging Logic

The system divides the entire table into multiple **2×2 sub-matrix blocks**, each named after its top-left cell (for example, `B2` is the block starting at the 2nd row and 2nd column). The four elements of a block are referred to as the first (top-left), second (top-right), third (bottom-left), and fourth (bottom-right) element. If any of the four corner points of a block is `null`, the block is classified as "incomplete" and needs to be merged.

Merging Rules (Horizontal Expansion)

| Condition | Search Direction | Judging Criteria |
| --- | --- | --- |
| The first element is `null` | Left block | Is the second element of the left neighbour (such as `A2`) `null`? |
| The second element is `null` | Right block | Is the first element of the right neighbour (such as `C2`) `null`? |
| The third element is `null` | Left block | Is the fourth element of the left neighbour `null`? |
| The fourth element is `null` | Right block | Is the third element of the right neighbour `null`? |

Example: `B2: [[null, a], [c, b]]` → Check if the second element of `A2` is `null`.

Using breadth-first search (BFS), all "incomplete" blocks that can be connected horizontally are collected into one merged group.

3.3 Vertical Merging Trigger Conditions

When horizontal merging is complete, if the **2×2 corner matrix of the merged group still contains `null`**, the vertical merging process is started.

Vertical Merging Rules

| Condition | Search Direction | Judging Criteria |
| --- | --- | --- |
| The first element is `null` | Upper block | Is the third element of the upper block (such as `B1`) `null`? |
| The second element is `null` | Upper block | Is the fourth element of the upper block `null`? |
| The third element is `null` | Lower block | Is the first element of the lower block (such as `B3`) `null`? |
| The fourth element is `null` | Lower block | Is the second element of the lower block `null`? |

Example: `B2: [[a, null], [c, b]]` → Check if the fourth element of `B1` is `null`.

The program keeps expanding the merged group until all 2×2 blocks within its bounding box are included, so that the group forms a complete rectangular area.

3.4 Result Generation after Merging

After merging is complete, the system calculates the minimum and maximum row and column of the group, generates a new 2×2 matrix representing the four corner points of the merged area, and records the original cell range (such as "A1+B1+A2+B2").

- // Merge tables
- function solveWithMerging(input: MatrixValue[][]): MergeResult[] {
- const rows = input.length;
- const cols = input[0].length;
- if (rows < 2 || cols < 2) {
- return []; // Fewer than 2×2 nodes: there are no cells to merge
- }
- // 1. Extract all 2x2 submatrices
- const blocks: Record<string, MatrixValue[][]> = {};
- const positions: Record<string, Position> = {};
- for (let r = 0; r <= rows - 2; r++) {
- for (let c = 0; c <= cols - 2; c++) {
- const key = `${String.fromCharCode(65 + c)}${r + 1}`;
- blocks[key] = [
- [input[r][c], input[r][c + 1]],
- [input[r + 1][c], input[r + 1][c + 1]]
- ];
- positions[key] = { row: r, col: c };
- }
- }
- // Tool: Determine if there is a null value
- const hasNull = (mat: MatrixValue[][]): boolean =>
- mat.some(row => row.some(cell => cell === null));
- const processed = new Set<string>(); // Blocks that have participated in the merger
- const results: MergeResult[] = [];
- // Collect all blocks inside the bounding box of the visited group
- const getAllBlockNames = (visited: Set<string>): { fullRangeKeys: string[], newMatrix: MatrixValue[][] } => {
- // Obtain the bounding box (the original merged area)
- let minRow = Infinity, maxRow = -Infinity;
- let minCol = Infinity, maxCol = -Infinity;
- Array.from(visited).forEach(key => {
- const { row, col } = positions[key];
- minRow = Math.min(minRow, row);
- maxRow = Math.max(maxRow, row);
- minCol = Math.min(minCol, col);
- maxCol = Math.max(maxCol, col);
- });
- // ===== Expansion: Generate all 2×2 blocks within the bounding box (complete rectangular area) =====
- const fullRangeKeys: string[] = [];
- for (let r = minRow; r <= maxRow; r++) {
- for (let c = minCol; c <= maxCol; c++) {
- const key = `${String.fromCharCode(65 + c)}${r + 1}`;
- fullRangeKeys.push(key);
- // Mark these blocks as processed (to prevent repetition in separate blocks)
- processed.add(key);
- }
- };
- // Extract the new 2x2 matrix (the four corners)
- const safeGet = (r: number, c: number): MatrixValue =>
- r < rows && c < cols ? input[r][c] : null;
- const newMatrix: MatrixValue[][] = [
- [safeGet(minRow, minCol), safeGet(minRow, maxCol + 1)],
- [safeGet(maxRow + 1, minCol), safeGet(maxRow + 1, maxCol + 1)]
- ];
- return { fullRangeKeys, newMatrix }
- }
- // ===== Stage 1: Handling merged groups containing null values =====
- for (const startKey in blocks) {
- if (processed.has(startKey) || !hasNull(blocks[startKey])) continue;
- const visited = new Set<string>();
- const queue: string[] = [startKey];
- visited.add(startKey);
- processed.add(startKey);
- while (queue.length > 0) {
- const key = queue.shift()!;
- const { row, col } = positions[key];
- const block = blocks[key];
- const [a, b] = block[0];
- const [c, d] = block[1];
- const leftKey = col > 0 ? `${String.fromCharCode(64 + col)}${row + 1}` : null;
- const rightKey = col < cols - 2 ? `${String.fromCharCode(66 + col)}${row + 1}` : null;
- // First, perform horizontal merging: only the left and right neighbours are examined in this pass.
- // Rule 1: The first element is null → The third element above or the second element on the left
- if (a === null) {
- if (leftKey && blocks[leftKey] && !visited.has(leftKey) && blocks[leftKey][0][1] === null) {
- visited.add(leftKey);
- queue.push(leftKey);
- processed.add(leftKey);
- }
- }
- // Rule 2: The second element, null → The fourth element on the top or the first element on the right
- if (b === null) {
- if (rightKey && blocks[rightKey] && !visited.has(rightKey) && blocks[rightKey][0][0] === null) {
- visited.add(rightKey);
- queue.push(rightKey);
- processed.add(rightKey);
- }
- }
- // Rule 3: The third element (null) → The first element below or the fourth element on the left
- if (c === null) {
- if (leftKey && blocks[leftKey] && !visited.has(leftKey) && blocks[leftKey][1][1] === null) {
- visited.add(leftKey);
- queue.push(leftKey);
- processed.add(leftKey);
- }
- }
- // Rule 4: The fourth element (null) → The second element below or the third element on the right
- if (d === null) {
- if (rightKey && blocks[rightKey] && !visited.has(rightKey) && blocks[rightKey][1][0] === null) {
- visited.add(rightKey);
- queue.push(rightKey);
- processed.add(rightKey);
- }
- };
- }
- if (visited.size === 1) queue.push(startKey);
- if (!getAllBlockNames(visited).newMatrix.flat().every(item => item !== null)) {
- while (queue.length > 0) {
- const key = queue.shift()!;
- const { row, col } = positions[key];
- const block = blocks[key];
- const [a, b] = block[0];
- const [c, d] = block[1];
- const upKey = row > 0 ? `${String.fromCharCode(65 + col)}${row}` : null;
- const downKey = row < rows - 2 ? `${String.fromCharCode(65 + col)}${row + 2}` : null;
- // Rule 1: The first element is null → The third element above or the second element on the left
- if (a === null) {
- if (upKey && blocks[upKey] && !visited.has(upKey) && blocks[upKey][1][0] === null) {
- visited.add(upKey);
- queue.push(upKey);
- processed.add(upKey);
- }
- }
- // Rule 2: The second element, null → The fourth element on the top or the first element on the right
- if (b === null) {
- if (upKey && blocks[upKey] && !visited.has(upKey) && blocks[upKey][1][1] === null) {
- visited.add(upKey);
- queue.push(upKey);
- processed.add(upKey);
- }
- }
-
- // Rule 3: The third element (null) → The first element below or the fourth element on the left
-
- if (c === null) {
- if (downKey && blocks[downKey] && !visited.has(downKey) && blocks[downKey][0][0] === null) {
- visited.add(downKey);
- queue.push(downKey);
- processed.add(downKey);
- }
- }
- // Rule 4: The fourth element (null) → The second element below or the third element on the right
- if (d === null) {
- if (downKey && blocks[downKey] && !visited.has(downKey) && blocks[downKey][0][1] === null) {
- visited.add(downKey);
- queue.push(downKey);
- processed.add(downKey);
- }
- };
- }
- }
- const { fullRangeKeys, newMatrix } = getAllBlockNames(visited);
- const isOnlyCol = (cells: string[]): boolean => {
- const prefixes = new Set<string>();
- for (const cell of cells) {
- // Extract the initial letter portion (consecutive uppercase A-Z)
- const match = cell.match(/^[A-Z]+/);
- if (match) {
- prefixes.add(match[0]);
- }
- }
- return prefixes.size === 1;
- }
- if (isOnlyCol(fullRangeKeys)) {
- results.push({
- merged: {
- fullRangeKeys: fullRangeKeys, // The complete range after renaming
- matrix: newMatrix
- }
- });
- } else {
- // After obtaining all the merged elements, recombine them.
- const res = combineSubMatrices(input, fullRangeKeys);
- res.forEach(item => {
- results.push({
- merged: {
- fullRangeKeys: getAllBlockNames(new Set(item.name.split('+'))).fullRangeKeys, // The complete range after renaming
- matrix: item.data
- }
- });
- })
- }
- }
- // ===== Stage 2: Processing Independent Blocks (Not Merged and Not Overwritten) =====
- for (const key in blocks) {
- if (!processed.has(key)) {
- results.push({
- standalone: {
- key,
- matrix: blocks[key]
- }
- });
- }
- }
- return results
- }
- type Matrix = any[][];
- type SubMatrix2x2 = MatrixValue[][];
-
- interface CombineResult<T> {
- name: string;
- data: SubMatrix2x2;
- }
- /**
- * Generate all combinations of left block and right block, retaining only those where the row number of the left block is ≤ the row number of the right block.
- * Rules:
- * - Left block: Submatrix of the leftmost column (Column A)
- * - Right block: Submatrix of the rightmost column (Column C)
- * - Combination: Xr + Ys, where r ≤ s
- * - Output: All combinations that meet the conditions
- */
- // Changed to support any type T
- function combineSubMatrices<T>(matrix: Matrix, inputNames: string[]): CombineResult<T>[] {
- if (!matrix || matrix.length === 0 || matrix[0].length < 2) {
- throw new Error("Matrix must be at least 1x2");
- }
- const nameToPosition = new Map<string, { row: number; col: number }>();
- // Parse input name
- for (const rawName of inputNames) {
- const name = rawName.trim().toUpperCase();
- const match = name.match(/^([A-Z])(\d+)$/);
- if (!match) continue;
- const colIndex = match[1].charCodeAt(0) - 65;
- const rowIndex = parseInt(match[2], 10) - 1;
- if (rowIndex >= 0 && colIndex >= 0 &&
- rowIndex <= matrix.length - 2 && colIndex <= matrix[0].length - 2) {
- nameToPosition.set(name, { row: rowIndex, col: colIndex });
- }
- }
-
- if (nameToPosition.size === 0) {
- console.log("No valid submatrices found in input.");
- return [];
- }
- // Group by column
- const colGroups = new Map<number, Map<number, string>>(); // col -> row -> name
- nameToPosition.forEach((pos, name) => {
- if (!colGroups.has(pos.col)) {
- colGroups.set(pos.col, new Map());
- }
- colGroups.get(pos.col)!.set(pos.row, name);
- })
- // Identify the leftmost column (the left block) and the rightmost column (the right block)
- const cols = Array.from(colGroups.keys()).sort((a, b) => a - b);
- if (cols.length < 2) {
- console.log("Need at least two columns for combination.");
- return [];
- }
- const leftCol = cols[0];
- const rightCol = cols[cols.length - 1];
- const leftColMap = colGroups.get(leftCol)!;
- const rightColMap = colGroups.get(rightCol)!;
- // Obtain all row numbers
- const leftRows = Array.from(leftColMap.keys()).sort((a, b) => a - b);
- const rightRows = Array.from(rightColMap.keys()).sort((a, b) => a - b);
- const results: CombineResult<T>[] = [];
- // Generate all combinations of left block and right block, and retain only those where the row number of the left block is less than or equal to that of the right block.
- for (const leftRow of leftRows) {
- const leftName = leftColMap.get(leftRow)!;
- const leftRowNum = leftRow + 1; // 0-based to 1-based
- for (const rightRow of rightRows) {
- const rightName = rightColMap.get(rightRow)!;
- const rightRowNum = rightRow + 1;
- // Only retain the combinations where the left block row number is less than or equal to the right block row number.
- if (leftRowNum > rightRowNum) continue;
- const combinedName = `${leftName}+${rightName}`;
- try {
- // Unified rule: For Xr + Ys
- // - [0][0]: Top-left corner of Xr
- // - [0][1]: Top-right corner of Yr (same as the row number of the left block)
- // - [1][0]: Bottom-left corner of Ys (same as the row number of the right block)
- // - [1][1]: Bottom-right corner of Ys
- const yRowName = `${String.fromCharCode(65 + rightCol)}${leftRowNum}`;
- const xSRowName = `${String.fromCharCode(65 + leftCol)}${rightRowNum}`;
- if (!nameToPosition.has(yRowName) || !nameToPosition.has(xSRowName)) {
- console.warn(`Required blocks not found for ${combinedName}: ${yRowName}, ${xSRowName}`);
- continue;
- }
- const yRowPos = nameToPosition.get(yRowName)!;
- const xSRowPos = nameToPosition.get(xSRowName)!;
- const topLeft = matrix[leftRow][leftCol];
- const topRight = matrix[yRowPos.row][yRowPos.col + 1];
- const bottomLeft = matrix[xSRowPos.row + 1][xSRowPos.col];
- const bottomRight = matrix[rightRow + 1][rightCol + 1];
- const data: SubMatrix2x2 = [
- [topLeft, topRight],
- [bottomLeft, bottomRight]
- ];
- if (!data.flat().filter(item => !item).length) {
- results.push({ name: combinedName, data });
- break;
- }
- } catch (error) {
- console.warn(`Error processing ${combinedName}:`, error);
- continue;
- }
- }
- }
- return results;
- }
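
The first step of the merging routine, splitting the node grid into named 2×2 blocks and finding the incomplete ones, can be tried in isolation. The following is a reduced sketch with string placeholders instead of points; `Corner`, `extractBlocks`, and `incompleteKeys` are hypothetical names, and none of the BFS merging is reproduced here.

```typescript
type Corner = string | null; // stand-in for McGePoint3d | null

// Split the node grid into 2×2 blocks, keyed by the spreadsheet-style name of the top-left cell.
function extractBlocks(grid: Corner[][]): Record<string, Corner[][]> {
  const blocks: Record<string, Corner[][]> = {};
  for (let r = 0; r + 1 < grid.length; r++) {
    for (let c = 0; c + 1 < grid[r].length; c++) {
      const key = `${String.fromCharCode(65 + c)}${r + 1}`;
      blocks[key] = [
        [grid[r][c], grid[r][c + 1]],
        [grid[r + 1][c], grid[r + 1][c + 1]],
      ];
    }
  }
  return blocks;
}

// Names of all blocks that have at least one missing corner.
function incompleteKeys(blocks: Record<string, Corner[][]>): string[] {
  return Object.keys(blocks).filter(k =>
    blocks[k].some(row => row.some(v => v === null))
  );
}

// The top edge has no node in the middle column, i.e. cells A1 and B1 are merged in the drawing.
const demoCorners: Corner[][] = [
  ['p00', null, 'p02'],
  ['p10', 'p11', 'p12'],
  ['p20', 'p21', 'p22'],
];
const demoBlocks = extractBlocks(demoCorners);
const needMerge = incompleteKeys(demoBlocks);
```

In this example `needMerge` contains `A1` and `B1`: the second element of `A1` and the first element of `B1` are the same missing node, which is exactly the situation the horizontal rule table describes. Note that `String.fromCharCode(65 + c)` only produces valid column letters for the first 26 columns, so wider tables would need a multi-letter naming scheme.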

4. Text Extraction and Excel Export

4.1 Text Matching

Traverse all text entities (`McDbText` / `McDbMText`) and determine whether the geometric center of each entity's bounding box falls within a cell. If it does, append the entity's content to that cell.

- /**
- * Determine whether all the points are within the rectangular area (including the boundaries)
- * @param rectPoints - The four vertices of the rectangle (the order does not matter, it should be an axis-aligned rectangle)
- * @param points - Array of points
- * @returns True if all points are within the rectangle, otherwise false
- */
- function arePointsInRectangle(
- rectPoints: McGePoint3d[],
- points: McGePoint3d[],
- ): boolean {
- // Extract all x and y coordinates
- const xs = rectPoints.map(p => p.x);
- const ys = rectPoints.map(p => p.y);
- const minX = Math.min(...xs);
- const maxX = Math.max(...xs);
- const minY = Math.min(...ys);
- const maxY = Math.max(...ys);
- // Check whether a single point is within the rectangular boundary (including the boundary)
- const isPointInRect = (p: McGePoint3d): boolean => {
- return p.x >= minX && p.x <= maxX && p.y >= minY && p.y <= maxY;
- };
- // All points must be within the rectangle.
- return points.every(pt => isPointInRect(pt));
- }
- // Filter out all the table data
- const tableDataArr: CellInput[] = []
- const results = solveWithMerging(cellPointsArr);
- const getTextContent = (matrix: McGePoint3d[][]): string => {
- let str: string = '';
- const textArr = scopeAllEntity.filter(item => {
- const ent = item.clone() as McDbEntity;
- let _minPt: McGePoint3d, _maxPt: McGePoint3d
- if (ent instanceof McDbText) {
- const { minPt, maxPt } = ent.getBoundingBox();
- _minPt = minPt;
- _maxPt = maxPt;
- } else if (ent instanceof McDbMText) {
- const textStyleId = MxCpp.getCurrentMxCAD().getDatabase().getCurrentlyTextStyleId();
- ent.textStyleId = textStyleId;
- (ent as McDbMText).reCompute();
- const { minPt, maxPt } = MxCADUtility.getTextEntityBox(ent, false);
- _minPt = minPt;
- _maxPt = maxPt;
- }
- if (_maxPt && _minPt) {
- // Text inside a block: restore its real position before testing it against the cell corners
- const res = needTransformEntity.find(i => i.handle === item.getHandle())
- if (res) {
- _minPt.transformBy(res.mart);
- _maxPt.transformBy(res.mart);
- }
- return arePointsInRectangle(matrix.flat(), [_minPt.clone().addvec(_maxPt.sub(_minPt).mult(1 / 2))])
- } else {
- return false
- }
- })
- if (textArr.length) {
- textArr.forEach(text => {
- if (text instanceof McDbText) {
- str += `${text.textString}\n`
- } else if (text instanceof McDbMText) {
- str += `${text.contents}\n`
- }
- })
- };
- return str
- }
- results.forEach(res => {
- if (res.merged) {
- const { fullRangeKeys, matrix } = res.merged;
- const str = getTextContent(matrix);
- tableDataArr.push({ type: DataType.merged, content: str, name: fullRangeKeys.join('+') })
- } else if (res.standalone) {
- const { key, matrix } = res.standalone;
- const str = getTextContent(matrix);
- tableDataArr.push({ type: DataType.standalone, content: str, name: key });
- }
- });
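
The matching rule itself is independent of mxcad and can be sketched with plain bounding boxes. In the example below, `XY`, `TextBox`, `boxCenter`, `cornersContain`, and `textForCell` are illustrative names, and the sample coordinates are made up.

```typescript
interface XY { x: number; y: number; }
interface TextBox { content: string; min: XY; max: XY; }

// Center of an axis-aligned bounding box.
function boxCenter(t: TextBox): XY {
  return { x: (t.min.x + t.max.x) / 2, y: (t.min.y + t.max.y) / 2 };
}

// True when p lies inside (or on the edge of) the axis-aligned rectangle spanned by the corners.
function cornersContain(corners: XY[], p: XY): boolean {
  const xs = corners.map(c => c.x);
  const ys = corners.map(c => c.y);
  return p.x >= Math.min(...xs) && p.x <= Math.max(...xs)
    && p.y >= Math.min(...ys) && p.y <= Math.max(...ys);
}

// Join the content of every text whose center falls inside the cell, one line per text.
function textForCell(corners: XY[], texts: TextBox[]): string {
  return texts
    .filter(t => cornersContain(corners, boxCenter(t)))
    .map(t => t.content)
    .join('\n');
}

const cellCorners: XY[] = [{ x: 0, y: 10 }, { x: 20, y: 10 }, { x: 0, y: 0 }, { x: 20, y: 0 }];
const sampleTexts: TextBox[] = [
  { content: 'Part No.', min: { x: 2, y: 6 }, max: { x: 12, y: 9 } },
  { content: 'M8 bolt', min: { x: 2, y: 1 }, max: { x: 10, y: 4 } },
  { content: 'outside', min: { x: 18, y: 2 }, max: { x: 30, y: 5 } }, // center x = 24, outside the cell
];
const cellText = textForCell(cellCorners, sampleTexts);
```

Testing the center rather than the whole bounding box tolerates text that slightly overlaps a cell border. Because the boundary is inclusive, a text whose center sits exactly on a shared edge would match both neighbouring cells, which may or may not be acceptable for a given drawing.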

4.2 Excel Output

Use the `ExcelJS` library to create a workbook and perform the following operations:

- **Merge cells**: Merge across rows and columns based on `fullRangeKeys`
- **Fill content**: Write the extracted text
- **Style beautification**: Add borders, center alignment, and automatic line wrapping
- **File export**: Generate a Blob for download in the browser, or save as an `.xlsx` file on the Node.js side

- /**
- * Export the cell data to Excel
- */
- async function exportExcelFromCells(
- data: CellInput[],
- filename: string = 'tableData.xlsx'
- ) {
- const workbook = new ExcelJS.Workbook();
- const worksheet = workbook.addWorksheet('Sheet1');
- const cellRegex = /^([A-Z]+)(\d+)$/;
- const parsedMerges: { start: { row: number; col: number }; end: { row: number; col: number } }[] = [];
- const cellsToSet: { row: number; col: number; value: string }[] = [];
- /**
- * Convert an A1-style cell name into { row, col } (both 1-based)
- */
- function parseCellRef(cellName: string): { row: number; col: number } {
- const match = cellName.match(cellRegex);
- if (!match) throw new Error(`Invalid cell name: ${cellName}`);
- const [, colStr, rowStr] = match;
- let col = 0;
- for (let i = 0; i < colStr.length; i++) {
- col = col * 26 + (colStr.charCodeAt(i) - 64);
- }
- return { row: parseInt(rowStr), col };
- }
- // Step 1: Process all the data
- for (const item of data) {
- if (item.type === DataType.merged) {
- const cellNames = item.name.split('+').map(s => s.trim());
- const positions = cellNames.map(parseCellRef);
- const startRow = Math.min(...positions.map(p => p.row));
- const endRow = Math.max(...positions.map(p => p.row));
- const startCol = Math.min(...positions.map(p => p.col));
- const endCol = Math.max(...positions.map(p => p.col));
- parsedMerges.push({
- start: { row: startRow, col: startCol },
- end: { row: endRow, col: endCol }
- });
- worksheet.mergeCells(startRow, startCol, endRow, endCol);
- const masterCell = worksheet.getCell(startRow, startCol);
- masterCell.value = item.content;
- masterCell.alignment = { horizontal: 'center', vertical: 'middle' };
- } else if (item.type === DataType.standalone) {
- const pos = parseCellRef(item.name);
- cellsToSet.push({ row: pos.row, col: pos.col, value: item.content });
- }
- }
- // Step 2: Set up separate cells (skip merged areas)
- for (const cell of cellsToSet) {
- const isOverlapped = parsedMerges.some(merge =>
- cell.row >= merge.start.row &&
- cell.row <= merge.end.row &&
- cell.col >= merge.start.col &&
- cell.col <= merge.end.col
- );
- if (!isOverlapped) {
- const wsCell = worksheet.getCell(cell.row, cell.col);
- wsCell.value = cell.value;
- }
- }
- // Step 3: Add border styles to all used cells
- // Correct syntax: TypeScript compatible
- const borderStyle = {
- top: { style: 'thin' as const, color: { argb: 'FF000000' } },
- left: { style: 'thin' as const, color: { argb: 'FF000000' } },
- bottom: { style: 'thin' as const, color: { argb: 'FF000000' } },
- right: { style: 'thin' as const, color: { argb: 'FF000000' } }
- };
- // Obtain the maximum row and column range
- let maxRow = 1;
- let maxCol = 1;
- [...cellsToSet, ...parsedMerges.flatMap(merge => [
- merge.start, { row: merge.end.row, col: merge.end.col }
- ])].forEach(pos => {
- maxRow = Math.max(maxRow, pos.row);
- maxCol = Math.max(maxCol, pos.col);
- });
- // Add borders to all the cells that might be used
- for (let row = 1; row <= maxRow; row++) {
- for (let col = 1; col <= maxCol; col++) {
- const cell = worksheet.getCell(row, col);
- if (cell.value !== null && cell.value !== undefined) {
- cell.border = borderStyle;
- // Optional: Default to center alignment
- if (!cell.alignment) {
- cell.alignment = { horizontal: 'center', vertical: 'middle', wrapText: true };
- }
- }
- }
- }
- // Browser environment
- const buffer = await workbook.xlsx.writeBuffer();
- const blob = new Blob([buffer], {
- type: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
- });
- // @ts-ignore
- saveAs(blob, filename);
- }
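
The step that turns a name such as "A1+B1+A2+B2" into a merge range can be checked without ExcelJS. The following is a minimal sketch of that calculation only; `CellPos`, `parseCellName`, and `mergeRange` are names made up for this example.

```typescript
interface CellPos { row: number; col: number; }

// Convert an A1-style name into 1-based row and column numbers ("AA10" → row 10, col 27).
function parseCellName(cellName: string): CellPos {
  const match = /^([A-Z]+)(\d+)$/.exec(cellName.trim());
  if (!match) throw new Error(`Invalid cell name: ${cellName}`);
  let col = 0;
  for (let i = 0; i < match[1].length; i++) {
    col = col * 26 + (match[1].charCodeAt(i) - 64);
  }
  return { row: parseInt(match[2], 10), col };
}

// Bounding range of all cells in a "+"-joined name.
function mergeRange(joined: string): { start: CellPos; end: CellPos } {
  const cells = joined.split('+').map(parseCellName);
  const rows = cells.map(c => c.row);
  const cols = cells.map(c => c.col);
  return {
    start: { row: Math.min(...rows), col: Math.min(...cols) },
    end: { row: Math.max(...rows), col: Math.max(...cols) },
  };
}

const demoRange = mergeRange('A1+B1+A2+B2');
const wideCell = parseCellName('AA10');
```

The `start` and `end` values correspond to the four numbers passed to `worksheet.mergeCells(startRow, startCol, endRow, endCol)` in the export function above.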

5. Practical Results

Following the steps above, we developed a preliminary demo for extracting table data. For other table layouts, the same approach can serve as a basis for developing further table recognition features.