1. Get the first line, which often contains the column names, and dump it into a small text file
uki $ head -n 1 source_file_name.txt > header_line.txt
2. Get the first record after the header line and dump it into a small text file
uki $ head -n 2 source_file_name.txt | tail -n 1 > first_data_line.txt
3. Finally, when developing against large files, I take a SAMPLE of 1000 records (out of millions) to speed up dev time. I use 1000 because that is the default row limit for SELECT * in MySQL Workbench, but you can use any other number. I would not go too small, though, as you may not catch memory leak errors. The number 2500 in this example is arbitrary; I change it occasionally to pull a different sample, because you do want to sample your data in different places. With 2500 and 1000, the command keeps lines 1501 through 2500 of the source file.
uki $ head -n 2500 source_file_name.txt | tail -n 1000 > sample_1000_records.txt
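If you move the sampling window often, the same head and tail pipeline can be wrapped in a small shell function. This is only a sketch: the function name sample_lines and its argument order are my own assumptions, and the file names are placeholders.

```shell
# Hypothetical helper (not part of the original commands):
# print COUNT lines of FILE, ending at line number END.
# Usage: sample_lines FILE END COUNT
sample_lines() {
  head -n "$2" "$1" | tail -n "$3"
}

# Example calls (placeholder file names, arbitrary offsets):
# sample_lines source_file_name.txt 2500 1000 > sample_1000_records.txt
# sample_lines source_file_name.txt 750000 1000 > sample_1000_records.txt
```

Changing only the END argument pulls the same number of records from a different place in the file. Note that the sample will not include the header line unless the window starts at line 1.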