

It looks like you can combine the header rows dynamically based on a word's position in the line. You can skip the first two lines, and combine the next two. If you do it right, you will be left with an iterator over a file stream that you can use to process the remainder of the data as you wish. You can convert it to a different format, or even import it into a pandas DataFrame directly.

    header = get_words_and_positions(next(iterator)) + \

The important thing here is that you have an iterator pointing to the actual data in the file stream, and a set of dynamically loaded column headers. You can now convert the file to a true CSV, or do something else with it.

To write the remainder to a CSV file without loading the entire thing into memory at once, use csv.writer and the iterator from above:

    import csv

You can combine the nested output "with" and the outer input "with" into a single outer block to reduce the nesting levels:

    with open('file.dat') as file, open('outputfile.csv', 'w') as output:

To read the data into a pandas DataFrame, you can just pass the file object to pandas.read_csv. Since the file stream is past the headers at this point, this will not cause any issues:

    import pandas as pd
    df = pd.

Alternatively, in a spreadsheet application, select the desired Excel workbook and press Import data. The file then opens in the spreadsheet. Now, select File > Download > Comma Separated Values (.csv). Lastly, you will get a new CSV file, as shown in the following picture.
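
The Python approach described above (skip two lines, merge the next two header rows by word position, then stream the rest through csv.writer) can be sketched roughly as follows. This is only an illustration under stated assumptions: the text never shows the body of get_words_and_positions, so the version here is a guess at what it might do, and the sample data, the read_header and convert helpers, and the rule of ordering header words by their starting column are all hypothetical.

```python
import csv
import io
import re

# Hypothetical sample: two lines to skip, two header rows, then data.
SAMPLE = """\
Generated by a hypothetical logger
----------------------------------
time        pressure
      depth          temp
0.0   1.5   101.3    20.1
1.0   2.5   101.9    19.8
"""


def get_words_and_positions(line):
    """Return (start_column, word) pairs for each word in the line.

    Assumed implementation; the original helper is not shown in the text.
    """
    return [(m.start(), m.group()) for m in re.finditer(r"\S+", line)]


def read_header(iterator):
    """Skip two lines, then merge the next two rows into one header list."""
    next(iterator)
    next(iterator)
    pairs = get_words_and_positions(next(iterator)) + \
        get_words_and_positions(next(iterator))
    # Assumption: column order follows each word's starting column.
    pairs.sort()
    return [word for _, word in pairs]


def convert(source, output):
    """Stream a whitespace-delimited file to CSV, one row at a time."""
    iterator = iter(source)
    header = read_header(iterator)
    writer = csv.writer(output)
    writer.writerow(header)
    # The iterator is already past the headers, so only data rows remain.
    writer.writerows(line.split() for line in iterator if line.strip())


source = io.StringIO(SAMPLE)
output = io.StringIO()
convert(source, output)
result = output.getvalue()
print(result)
```

With real files, the same function could be called as convert(file, output) inside the combined "with" block from the text. Opening the output with newline='' is what the csv module documentation recommends, though the original snippet does not do so.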
