LeetCode 194: Transpose File
A clear explanation of the Transpose File shell problem using awk to transform rows into columns.
Problem Restatement
We are given a text file named file.txt.
Each line contains several words separated by spaces.
We need to transpose the file.
That means:
- The first column becomes the first row.
- The second column becomes the second row.
- The third column becomes the third row.
- And so on.
Example input:
name age
alice 21
ryan 30
Expected output:
name alice ryan
age 21 30
The original rows become columns.
Input and Output
| Item | Meaning |
|---|---|
| Input | A file named file.txt |
| Output | The transposed version of the file |
| Separator | Words are separated by spaces |
| Language | Bash shell script |
Examples
Input:
name age
alice 21
ryan 30
Visual form:
| Column 1 | Column 2 |
|---|---|
| name | age |
| alice | 21 |
| ryan | 30 |
After transposing:
| Row 1 | name alice ryan |
|---|---|
| Row 2 | age 21 30 |
Output:
name alice ryan
age 21 30
Another example:
Input:
a b c
d e f
g h i
Output:
a d g
b e h
c f i
First Thought
We need to rebuild the file column by column.
While reading each row, we can collect values belonging to the same column.
For example:
a b c
d e f
g h i
Column 1 becomes:
a d g
Column 2 becomes:
b e h
Column 3 becomes:
c f i
This suggests storing data indexed by column number.
Key Insight
awk automatically splits each line into fields.
For example:
a b c
Inside awk:
| Variable | Value |
|---|---|
$1 |
a |
$2 |
b |
$3 |
c |
NF |
3 |
NF means "number of fields".
So while processing each line, we can append every field to a column buffer.
Example:
| Column Buffer | Value |
|---|---|
col[1] |
a d g |
col[2] |
b e h |
col[3] |
c f i |
After reading the whole file, we print each buffer.
Algorithm
For every line:
- Loop through all fields from
1toNF. - If the column already has content, append a space.
- Append the current field to that column.
- Track the maximum number of columns.
After processing all lines:
- Print every column buffer from
1tomax_columns.
Correctness
Each line is split into fields automatically by awk.
For every field position i, the algorithm appends the current field $i to the buffer corresponding to column i.
Therefore:
- All first-column values are stored together.
- All second-column values are stored together.
- And so on.
The order of insertion matches the original row order.
So each column buffer becomes exactly one row of the transposed matrix.
Finally, printing the buffers from left to right outputs the correct transpose of the original file.
Complexity
Suppose the file contains r rows and c columns.
| Metric | Value | Why |
|---|---|---|
| Time | O(r × c) |
Every field is visited once |
| Space | O(r × c) |
The transposed content is stored in memory |
Implementation
awk '
{
for (i = 1; i <= NF; i++) {
if (NR == 1) {
col[i] = $i
} else {
col[i] = col[i] " " $i
}
}
}
END {
for (i = 1; i <= NF; i++) {
print col[i]
}
}
' file.txt
Code Explanation
The main loop processes each line:
{
for (i = 1; i <= NF; i++) {
NF is the number of fields in the current line.
So the loop visits every column.
For the first row:
if (NR == 1)
we initialize the column buffer directly.
NR means "number of records", which is the current line number.
For later rows:
col[i] = col[i] " " $i
we append the new value with a space separator.
At the end:
END {
runs after the entire file has been processed.
Then:
print col[i]
prints each transposed row.
Testing
Create a sample file:
cat > file.txt << 'EOF'
name age
alice 21
ryan 30
EOF
Run the solution:
awk '
{
for (i = 1; i <= NF; i++) {
if (NR == 1) {
col[i] = $i
} else {
col[i] = col[i] " " $i
}
}
}
END {
for (i = 1; i <= NF; i++) {
print col[i]
}
}
' file.txt
Expected output:
name alice ryan
age 21 30
Another test:
Input:
a b c
d e f
g h i
Expected output:
a d g
b e h
c f i
Single-row test:
Input:
x y z
Expected output:
x
y
z