Create Extracts for Household-Person Microdata Collections
Below we provide curl examples showing how to use the IPUMS API to create and manage data extracts for the household-person microdata collections it supports (IPUMS USA, IPUMS CPS, and IPUMS International). The examples use IPUMS CPS, but they work the same way for USA or International: update the collection= value in the API URL and substitute variable and sample names from that collection.
Get your key from your IPUMS user account management page at https://account.ipums.org/api_keys.
Load Libraries and Set Key
# set the IPUMS_API_KEY environment variable using bash shell
export IPUMS_API_KEY=YOUR_API_KEY_HERE
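Every request below reads the key from the environment, so a script can check that it is set before calling the API. The following is a minimal sketch; `require_api_key` is a hypothetical helper name, not part of the IPUMS API or its client libraries.

```shell
# Hypothetical helper: succeed only if IPUMS_API_KEY is set and non-empty.
require_api_key() {
  if [ -z "${IPUMS_API_KEY:-}" ]; then
    echo "IPUMS_API_KEY is not set; export it before calling the API" >&2
    return 1
  fi
}

# Example use at the top of a script:
# require_api_key || exit 1
```

Failing early this way avoids sending a request with an empty Authorization header.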
Submit a Data Extract Request
To submit a data extract request, construct a JSON payload manually (unless you are using one of the aforementioned R or Python client libraries), then submit it to the API.
The names to use for samples and variables in the data extract request can be discovered on our website. See Explore IPUMS Household-Person Microdata Collection Metadata for more information.
The following example uses the curl program in a bash command line environment to obtain a subset of variables from the 2018 and 2019 CPS ASEC data.
# construct the JSON payload manually and submit it
curl --location --request POST 'https://api.ipums.org/extracts?collection=cps&version=2' \
--header "Authorization: $IPUMS_API_KEY" \
--header 'Content-Type: application/json' \
--data-raw '{
"description": "Example extract",
"dataStructure": {
"rectangular": {
"on": "P"
}
},
"dataFormat": "fixed_width",
"samples": {
"cps2018_03s": {},
"cps2019_03s": {}
},
"variables":{
"AGE": {},
"SEX": {},
"RACE": {},
"STATEFIP": {}
}
}'
# A successful request will return a response that includes an extract number in the number attribute:
{
"number": 1,
"status": "queued",
"downloadLinks": {},
"extractDefinition": {
"version": 2,
"dataStructure": {
"rectangular": {
"on": "P"
}
},
"dataFormat": "fixed_width",
"caseSelectWho": "individuals",
"description": "Example extract",
"samples": {
"cps2018_03s": {},
"cps2019_03s": {}
},
"variables": {
"YEAR": {
"preselected": true
},
"SERIAL": {
"preselected": true
},
"MONTH": {
"preselected": true
},
"CPSID": {
"preselected": true
},
"ASECFLAG": {
"preselected": true
},
"ASECWTH": {
"preselected": true
},
"STATEFIP": {},
"PERNUM": {
"preselected": true
},
"CPSIDP": {
"preselected": true
},
"CPSIDV": {
"preselected": true
},
"ASECWT": {
"preselected": true
},
"AGE": {},
"SEX": {},
"RACE": {}
},
"collection": "cps"
}
}
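If you submit many requests, writing the payload by hand becomes repetitive. The sketch below assembles the same rectangular, fixed-width payload shown above from lists of sample and variable names. `json_members` and `build_extract_payload` are hypothetical helper names, and the sketch assumes the names and description contain no double quotes or backslashes, since it does no JSON escaping.

```shell
# Hypothetical helpers (not part of the IPUMS API or its client libraries).

# Print each argument as an empty JSON object member: "AGE": {}, "SEX": {}
json_members() {
  local out="" name
  for name in "$@"; do
    if [ -n "$out" ]; then
      out="$out, "
    fi
    out="$out\"$name\": {}"
  done
  printf '%s' "$out"
}

# Print a rectangular-on-person, fixed-width extract request on one line.
# $1 = description, $2 = space-separated sample names,
# $3 = space-separated variable names
# Assumption: no value needs JSON escaping.
build_extract_payload() {
  local description="$1" samples="$2" variables="$3"
  # $samples and $variables are left unquoted on purpose so they split into words
  printf '{"description": "%s", "dataStructure": {"rectangular": {"on": "P"}}, "dataFormat": "fixed_width", "samples": {%s}, "variables": {%s}}\n' \
    "$description" "$(json_members $samples)" "$(json_members $variables)"
}

# Example use with the request shown above:
# curl --location --request POST 'https://api.ipums.org/extracts?collection=cps&version=2' \
#   --header "Authorization: $IPUMS_API_KEY" \
#   --header 'Content-Type: application/json' \
#   --data-raw "$(build_extract_payload 'Example extract' 'cps2018_03s cps2019_03s' 'AGE SEX RACE STATEFIP')"
```

The helper only covers the options used in this example; anything else would still need to be written into the payload by hand.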
You can also submit hierarchical extracts.
# construct the JSON payload manually and submit it
curl --location --request POST 'https://api.ipums.org/extracts?collection=cps&version=2' \
--header "Authorization: $IPUMS_API_KEY" \
--header 'Content-Type: application/json' \
--data-raw '{
"description": "Example hierarchical extract",
"dataStructure": {
"hierarchical": {}
},
"dataFormat": "fixed_width",
"samples": {
"cps2018_03s": {},
"cps2019_03s": {}
},
"variables":{
"AGE": {},
"SEX": {},
"RACE": {},
"STATEFIP": {}
}
}'
# A successful request will return a response that includes an extract number in the number attribute:
{
"number": 2,
"status": "queued",
"downloadLinks": {},
"extractDefinition": {
"version": 2,
"dataStructure": {
"hierarchical": {}
},
"dataFormat": "fixed_width",
"caseSelectWho": "individuals",
"description": "Example hierarchical extract",
"samples": {
"cps2018_03s": {},
"cps2019_03s": {}
},
"variables": {
"RECTYPE": {},
"YEAR": {
"preselected": true
},
"SERIAL": {
"preselected": true
},
"MONTH": {
"preselected": true
},
"CPSID": {
"preselected": true
},
"ASECFLAG": {
"preselected": true
},
"ASECWTH": {
"preselected": true
},
"STATEFIP": {},
"PERNUM": {
"preselected": true
},
"CPSIDP": {
"preselected": true
},
"CPSIDV": {
"preselected": true
},
"ASECWT": {
"preselected": true
},
"AGE": {},
"SEX": {},
"RACE": {}
},
"collection": "cps"
}
}
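Both responses above report the extract number in the number attribute, and later calls need that value. As a rough sketch, it can be pulled out of the response with sed. `extract_number` is a hypothetical helper name, and the pattern assumes "number" appears only once in the response, as it does in the examples above. A JSON-aware tool would be more robust.

```shell
# Hypothetical helper: read an API response on stdin and print the value
# of the first "number" field found.
extract_number() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example use: save the response from the submit call to response.json, then
# number=$(extract_number < response.json)
```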
Checking a Request’s Status
After submitting your extract request, you can use the API to check its status using the extract’s number.
curl --request GET 'https://api.ipums.org/extracts/1?collection=cps&version=2' --header 'Content-Type: application/json' --header "Authorization: $IPUMS_API_KEY"
# A successful request will provide a response object like below. The exact fields may vary depending on how far along the extract is in processing.
# You will get a status such as `queued`, `started`, `produced`, `canceled`, `failed` or `completed` in the status field.
{
"number": 1,
"status": "completed",
"downloadLinks": {
"basicCodebook": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.cbk",
"bytes": 8492,
"sha256": "37ce64df8300c73736e7fcfd6c4afb9faaddceddfe3a73bcaa435984ce3c2765"
},
"stataCommandFile": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.do",
"bytes": 11741,
"sha256": "fbad44930e54c3889a3e2e1eb3d04c48ed34743ceb70fcdda66168183ff1670b"
},
"data": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.dat.gz",
"bytes": 4577233,
"sha256": "68851b34fa1841a145403b033786f1bb125fbbc8f85297f776075df187e3a41f"
},
"rCommandFile": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.R",
"bytes": 406,
"sha256": "3009d74bdadd0fadab6b18e9deafc7833a8ef6e8117e8ab3b68008f2f7c64296"
},
"spssCommandFile": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.sps",
"bytes": 5945,
"sha256": "a29ff3ebc41a9bdc3ac65b71ceeef8179b12135607950aeee93ca362f7aa68b7"
},
"ddiCodebook": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.xml",
"bytes": 44616,
"sha256": "fea25a5b6af215142fa55cf3fd9ec2784532643021ebecac4d56e0d17ba1e935"
},
"sasCommandFile": {
"url": "https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.sas",
"bytes": 5990,
"sha256": "45f571f15fb17d9dd326db094e26f9ecc335b87629b00d4813cbb47fdc855a64"
}
},
"extractDefinition": {
"version": 2,
"dataStructure": {
"rectangular": {
"on": "P"
}
},
"dataFormat": "fixed_width",
"caseSelectWho": "individuals",
"description": "Example extract",
"samples": {
"cps2018_03s": {},
"cps2019_03s": {}
},
"variables": {
"YEAR": {
"preselected": true
},
"SERIAL": {
"preselected": true
},
"MONTH": {
"preselected": true
},
"CPSID": {
"preselected": true
},
"ASECFLAG": {
"preselected": true
},
"ASECWTH": {
"preselected": true
},
"STATEFIP": {},
"PERNUM": {
"preselected": true
},
"CPSIDP": {
"preselected": true
},
"CPSIDV": {
"preselected": true
},
"ASECWT": {
"preselected": true
},
"AGE": {},
"SEX": {},
"RACE": {}
},
"collection": "cps"
}
}
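Because an extract takes time to process, a script usually repeats the status check until the status field reads completed. The sketch below wraps the request above in a polling loop. The function names, the 30-second default interval, and the limit of 120 checks are illustrative assumptions, not values recommended by IPUMS.

```shell
# Hypothetical helpers; names and defaults are illustrative only.

# Read a status response on stdin and print the value of its "status" field.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Fetch the status response for extract number $1 (same request as above).
fetch_extract() {
  curl --silent --request GET \
    "https://api.ipums.org/extracts/$1?collection=cps&version=2" \
    --header "Authorization: $IPUMS_API_KEY"
}

# Poll until the extract completes.
# $1 = extract number, $2 = seconds between checks (default 30),
# $3 = maximum number of checks (default 120)
wait_for_extract() {
  local number="$1" interval="${2:-30}" max_checks="${3:-120}"
  local checks=0 status
  while [ "$checks" -lt "$max_checks" ]; do
    status=$(fetch_extract "$number" | extract_status)
    case "$status" in
      completed) return 0 ;;
      failed|canceled)
        echo "extract $number ended with status: $status" >&2
        return 1 ;;
    esac
    # Any other status (or an unreadable response) counts as still in progress.
    checks=$((checks + 1))
    sleep "$interval"
  done
  echo "gave up waiting for extract $number after $max_checks checks" >&2
  return 1
}

# Example use:
# wait_for_extract 1 && echo "extract 1 is ready to download"
```

Capping the number of checks keeps the loop from running forever if the response cannot be read.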
Retrieving Your Extract
To retrieve a completed extract, download each file from the URLs listed in the downloadLinks attribute of the status response, again authenticating with your API key.
# download the data file using the link returned in the status response once the extract has completed
curl -H "Authorization: $IPUMS_API_KEY" https://api.ipums.org/downloads/cps/api/v1/extracts/1234567/cps_00001.dat.gz > my_ipums_cps_extract_1.dat.gz
# repeat for the other files e.g. codebook etc...
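The status response shown earlier lists a url and a sha256 value for every file, so a script can download them all and confirm each arrived intact. The sketch below is one way to do that. `download_urls` and `verify_sha256` are hypothetical helper names, and the sketch assumes sha256sum (or shasum on macOS) is installed.

```shell
# Hypothetical helpers (not part of the IPUMS API or its client libraries).

# Read a status response on stdin and print every download URL, one per line.
download_urls() {
  grep -o '"url"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed only if the SHA-256 digest of file $1 equals the hex string $2.
verify_sha256() {
  local file="$1" expected="$2" actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  [ "$actual" = "$expected" ]
}

# Example use, with the status response saved to status.json:
# download_urls < status.json | while read -r url; do
#   curl --silent --header "Authorization: $IPUMS_API_KEY" \
#     --output "$(basename "$url")" "$url"
# done
# verify_sha256 cps_00001.dat.gz 68851b34fa1841a145403b033786f1bb125fbbc8f85297f776075df187e3a41f
```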
Now you are ready for further processing and analysis as you desire.
Get a Listing of Recent Extract Requests
You may also find it useful to get a historical listing of your extract requests.
curl -X GET \
'https://api.ipums.org/extracts?collection=cps&version=2' \
-H 'Content-Type: application/json' \
-H "Authorization: $IPUMS_API_KEY"
# If you omit the extract number from your API call, the API returns your 10 most recent extract requests by default. To change how many are returned, add a `limit=##` query parameter (joined to the rest of the URL with `&`) to get the ## most recent extracts instead.
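The listing URL differs only in its collection value and the optional limit described above, so it can be built by a small function. This is a sketch; `extracts_url` is a hypothetical helper name, and the limit parameter is taken from the text above rather than checked against the API.

```shell
# Hypothetical helper: print the extract-listing URL.
# $1 = collection (for example cps), $2 = optional number of recent extracts
extracts_url() {
  local collection="$1" limit="${2:-}"
  local url="https://api.ipums.org/extracts?collection=${collection}&version=2"
  if [ -n "$limit" ]; then
    url="${url}&limit=${limit}"
  fi
  printf '%s\n' "$url"
}

# Example use (the quotes keep the shell from treating & as a control operator):
# curl -X GET "$(extracts_url cps 25)" -H "Authorization: $IPUMS_API_KEY"
```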