As when you import data into a statistical package, data sent to Menai Insight must conform to some basic formatting requirements. As soon as your file is uploaded, we validate its formatting and let you know if changes are required. On this page, we illustrate how to format your file.
- Structure of your file
- Saving from Excel
- Including custom-defined ID codes
- Including multiple files in a single upload
Your file should be in csv format, which can be exported directly from an Excel file.
Each 'unit' of text to be processed (e.g., a sentence) should be on a separate row in the file, under a column titled "TEXT".
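If you prepare your data programmatically rather than in a spreadsheet, the layout described above can be sketched in Python as follows. The function name, file name, example sentences, and the UTF-8 encoding are illustrative assumptions, not requirements stated by Menai Insight.

```python
import csv

# Hypothetical example sentences; replace with your own units of text.
SENTENCES = [
    "Revenue grew in the third quarter.",
    "We expect costs to remain stable, despite inflation.",
]


def write_text_csv(path, units):
    """Write one unit of text per row under a single "TEXT" column."""
    # UTF-8 is an assumption; the page does not state a required encoding.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["TEXT"])    # header row
        for unit in units:
            writer.writerow([unit])  # the csv module quotes embedded commas


if __name__ == "__main__":
    write_text_csv("sentences.csv", SENTENCES)
```

Using the `csv` module rather than joining strings by hand keeps sentences that contain commas or quotation marks in a single cell.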
If you are saving from Excel, simply click Save As and select the ".csv" option, as illustrated below.
You may include custom ID codes (or 'keys') as part of your file; this can be useful for integrating the processed data back into your analysis. For example, depending on how you have organized your data, you could include your own ID for each sentence that you process.
Alternatively, you could include IDs for properties such as firm, year, manager, and sentence.
The name of each ID column should begin with the prefix "ID_", and each ID label should be unique (e.g., there cannot be two columns marked "ID_FIRM"). Other than that, there are few restrictions: we allow up to 10 ID columns, and you can use whatever ID columns are useful for you. The ID values need not be unique (although if you have a single ID column, it may make sense for them to be so).
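As a rough illustration of a file that carries several ID columns alongside the text, the sketch below writes firm, year, manager, and sentence IDs with Python's `csv.DictWriter`. The column names beyond the "ID_" prefix, the values, the function name, and the file name are all hypothetical.

```python
import csv

# Hypothetical ID columns; any labels are allowed as long as they start with "ID_".
FIELDNAMES = ["ID_FIRM", "ID_YEAR", "ID_MANAGER", "ID_SENTENCE", "TEXT"]

# Hypothetical rows: ID values may repeat across rows (e.g., the same firm).
ROWS = [
    {"ID_FIRM": "ACME", "ID_YEAR": 2020, "ID_MANAGER": "M01",
     "ID_SENTENCE": 1, "TEXT": "Revenue grew in the third quarter."},
    {"ID_FIRM": "ACME", "ID_YEAR": 2020, "ID_MANAGER": "M01",
     "ID_SENTENCE": 2, "TEXT": "We expect costs to remain stable."},
]


def write_rows_with_ids(path, rows, fieldnames=FIELDNAMES):
    """Write rows with ID columns and a TEXT column to a csv file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    write_rows_with_ids("sentences_with_ids.csv", ROWS)
```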
When we process your text, if you selected a JSON export, we simply attach the ID columns to the associated sentence. If you selected a csv export, we include the ID variables as the first columns of the returned csv file.
When processing the data, we ignore any column that is neither prefixed with "ID_" nor labeled "TEXT". We cap the total number of columns at 11, and encourage you not to include extraneous columns, as they increase file size and transfer time.
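If you would like to check a file's header before uploading, the column rules described above can be approximated locally. The sketch below is not Menai Insight's own validation; the function names are hypothetical, and it assumes that column names are case-sensitive.

```python
import csv

MAX_ID_COLUMNS = 10     # up to 10 ID columns are allowed
MAX_TOTAL_COLUMNS = 11  # total number of columns is capped at 11


def read_header(path):
    """Return the first row of a csv file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])


def check_header(columns):
    """Return (problems, ignored) for a list of column names.

    problems: messages for rule violations described on this page.
    ignored:  columns that are neither "TEXT" nor prefixed with "ID_".
    """
    problems = []
    id_cols = [c for c in columns if c.startswith("ID_")]

    if "TEXT" not in columns:
        problems.append('no column titled "TEXT"')

    duplicates = sorted({c for c in id_cols if id_cols.count(c) > 1})
    if duplicates:
        problems.append(f"duplicate ID columns: {duplicates}")

    if len(id_cols) > MAX_ID_COLUMNS:
        problems.append(f"more than {MAX_ID_COLUMNS} ID columns")

    if len(columns) > MAX_TOTAL_COLUMNS:
        problems.append(f"more than {MAX_TOTAL_COLUMNS} columns in total")

    ignored = [c for c in columns if c != "TEXT" and not c.startswith("ID_")]
    return problems, ignored
```

For example, `check_header(read_header("sentences.csv"))` returns an empty list of problems for a file whose header follows the rules, together with any columns that would be ignored during processing.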
We allow you to choose whether to process a single csv file (with up to one million sentence-rows*), or to upload a zip file containing multiple separate csv files. For example, you may find it easier to work with files containing just the sentences for an individual firm and to upload all firms to be processed together in a single zipped folder.
Files can be easily zipped together by right-clicking on the desired folder of csv files, clicking 'Send to', and selecting 'Compressed (zipped) folder'.
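If you prefer to build the zip file from a script, a minimal sketch using Python's standard `zipfile` module is shown below. The function name and paths are hypothetical, and placing the csv files at the top level of the archive (rather than inside a folder) is an assumption about the expected layout.

```python
import zipfile
from pathlib import Path


def zip_csv_folder(folder, zip_path):
    """Add every .csv file directly inside `folder` to a new zip archive.

    Returns the list of file names that were added.
    """
    csv_files = sorted(Path(folder).glob("*.csv"))
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for csv_file in csv_files:
            # arcname stores only the file name, so files sit at the
            # top level of the archive (an assumption, see above).
            zf.write(csv_file, arcname=csv_file.name)
    return [p.name for p in csv_files]
```

For example, `zip_csv_folder("firms", "firms.zip")` would bundle `firms/acme.csv` and `firms/globex.csv` into a single `firms.zip` for upload.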
*Note: processing should fall within your usage limits.