pandas is most often used to process data, but data can be recorded in many formats: tab-delimited text, .csv files, .json files, .dta files, and so on. Below I list some sources and give examples of how to import several of these formats into pandas and how to export data frames from it.


Import

Fixed Width

Find link here. The example they have is:

import pandas as pd

file_name = r"/tmp/file.txt"
fwidths = [11, 11, 11, 11, 11, 11]  # width of each column, in characters
df = pd.read_fwf(file_name, widths=fwidths,
                 names=['col0', 'col1', 'col2', 'col3', 'col4', 'col5'])
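
To try this without a file on disk, here is a minimal sketch that builds a small fixed-width text in memory and parses it. The sample rows and the column names `name`, `value`, and `flag` are made up for illustration; only `pd.read_fwf` and its `widths` and `names` arguments come from the example above.

```python
import io

import pandas as pd

# Hypothetical sample rows; every cell is padded to exactly 11 characters.
rows = [("alpha", "1.5", "x"), ("beta", "22.25", "y")]
raw = "\n".join("".join(f"{cell:<11}" for cell in row) for row in rows)

# Passing names means the first line is treated as data, not as a header.
df_fwf = pd.read_fwf(io.StringIO(raw), widths=[11, 11, 11],
                     names=["name", "value", "flag"])
print(df_fwf)
```

The padding is stripped on import, and the numeric column is parsed as floats.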

Excel File

Find link here. I have combined two of their examples:

pd.read_excel('tmp.xlsx', index_col = 0, sheet_name = 'Sheet3')

Stata File

Find link here.

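The pandas function read_stata handles value labels only by converting labeled columns to categoricals (convert_categoricals=True, the default); it does not return the label dictionaries alongside the data. When we want more control over value labels, we can use the library pyreadstat (GitHub repo), which returns the data frame together with a metadata object: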

import pyreadstat
df_circum, meta = pyreadstat.read_dta('../Data/1988/y1988.dta', apply_value_formats=True)
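
Whether pandas alone is enough depends on the file, so it can help to check what `read_stata` does with a labeled column. The sketch below is only an illustration on a made-up data frame written to a temporary file (not the data used above): it writes a categorical column with `to_stata`, then reads it back with and without `convert_categoricals`.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data: a categorical column is written to Stata as
# integer codes plus a value-label set.
df_src = pd.DataFrame({
    "id": [1, 2, 3],
    "sex": pd.Categorical(["male", "female", "male"]),
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "labels_demo.dta")
    df_src.to_stata(path, write_index=False)

    # Default behaviour: labeled integers come back as categoricals.
    df_labeled = pd.read_stata(path)
    # With convert_categoricals=False the raw integer codes are kept.
    df_codes = pd.read_stata(path, convert_categoricals=False)

print(df_labeled["sex"].tolist())
print(df_codes["sex"].tolist())
```

If the labels come back as expected this way, pandas may be sufficient; otherwise pyreadstat and its metadata object are the more flexible route.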

Tab Delimited File

Find link here. Example given on the page:

# raw string, so the "\f" in the path is not read as a form-feed escape
df = pd.read_csv(r'file_location\filename.txt', delimiter='\t')
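
A self-contained way to check the delimiter setting is to parse a small tab-delimited string held in memory. This is just a sketch: the sample text and its column names are invented, and `sep` is used here as the equivalent of `delimiter`.

```python
import io

import pandas as pd

# Hypothetical tab-delimited content with a header row.
raw = "city\tpopulation\nOslo\t700000\nBergen\t290000\n"

# sep="\t" tells read_csv to split fields on tab characters.
df_tab = pd.read_csv(io.StringIO(raw), sep="\t")
print(df_tab)
```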

Clipboard

df = pd.read_clipboard()

R

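import pyreadr

# read in the R file; read_r returns a dict-like object of data frames
file = pyreadr.read_r("nibrs_1991_2020_offense_segment_rds/nibrs_offense_segment_1998.rds")
# an .rds file holds a single unnamed object, stored under the key None
dfv98 = file[None]
dfv98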

Export

Stata

Official documentation here. One example is:

df.to_stata('state_expenditures.dta', write_index=False)
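
To see what `write_index=False` changes, the following sketch writes the same made-up data frame twice to a temporary directory and compares the columns that come back. The frame and file names are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data frame with the default integer index.
df_out = pd.DataFrame({"state": ["AL", "AK"], "spending": [10.5, 7.25]})

with tempfile.TemporaryDirectory() as tmp_dir:
    with_index = os.path.join(tmp_dir, "with_index.dta")
    no_index = os.path.join(tmp_dir, "no_index.dta")

    df_out.to_stata(with_index)                   # default: index is written
    df_out.to_stata(no_index, write_index=False)  # index is dropped

    cols_with_index = pd.read_stata(with_index).columns.tolist()
    cols_no_index = pd.read_stata(no_index).columns.tolist()

print(cols_with_index)
print(cols_no_index)
```

With the default setting the index is saved as an extra variable named `index`, which is usually not wanted in the Stata file.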

LaTeX

df.style.hide(axis='index').to_latex(
    '../../mean_crime.tex',
    position="h!",
    position_float="centering",
    hrules=True,
    label="tbl:crimedate",
    caption="Month When Maximum is Reached",
)