Before diving into the API, it helps to understand a few core concepts that underpin everything in OpenAlex.
Entity Types
OpenAlex describes scholarly research as a graph of interconnected entities. There are eight entity types:
Counts are approximate and change as new data is added. For a current count, request the list endpoint with per_page=1 (for example, /works?per_page=1) and read meta.count in the response.
| Entity | Description | Approx. Count |
|---|---|---|
| Works | Scholarly documents (articles, books, datasets, theses) | hundreds of millions |
| Authors | Researchers who create works | tens of millions |
| Sources | Where works are hosted (journals, repositories, conferences) | 250K+ |
| Institutions | Organizations where authors are affiliated | 110K+ |
| Topics | Subject classifications (4-level hierarchy) | 4.5K |
| Publishers | Organizations that distribute works | 10K+ |
| Funders | Organizations that fund research | 35K+ |
| Countries | Geographic information (countries, continents) | — |
Each entity type has its own API endpoint (e.g., /works, /authors).
Topic Hierarchy
Topics are organized in a four-level hierarchy:
| Level | Name | Count | Example |
|---|---|---|---|
| 1 | Domain | 4 | Physical Sciences |
| 2 | Field | 26 | Computer Science |
| 3 | Subfield | 254 | Artificial Intelligence |
| 4 | Topic | ~4,500 | Natural Language Processing |
Every work is assigned a primary_topic which includes the full hierarchy path. You can filter by any level: filter=primary_topic.domain.id:1, filter=primary_topic.field.id:17, etc.
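As a minimal sketch, the filter expression for each hierarchy level could be assembled like this. The helper name `topic_level_filter` is hypothetical. The domain and field keys come from the examples above; the subfield key is an assumption that follows the same naming pattern.

```python
# Filter keys for each level of the topic hierarchy.
# The domain and field keys appear in the text above; the subfield key is
# assumed to follow the same naming pattern.
LEVEL_FILTER_KEYS = {
    "domain": "primary_topic.domain.id",
    "field": "primary_topic.field.id",
    "subfield": "primary_topic.subfield.id",
    "topic": "primary_topic.id",
}


def topic_level_filter(level: str, entity_id: str) -> str:
    """Return a filter expression such as 'primary_topic.field.id:17'."""
    try:
        key = LEVEL_FILTER_KEYS[level.lower()]
    except KeyError:
        raise ValueError(f"unknown hierarchy level: {level!r}") from None
    return f"{key}:{entity_id}"


print(topic_level_filter("domain", "1"))
print(topic_level_filter("field", "17"))
```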
OpenAlex IDs
Every entity in OpenAlex has a unique OpenAlex ID. It’s a URL formatted like this:
https://openalex.org/W2741809807
ID Structure
The ID has two parts:
- Base: Always https://openalex.org/
- Key: A letter prefix + numeric ID (e.g., W2741809807)
The letter prefix indicates the entity type:
| Prefix | Entity |
|---|---|
| W | Work |
| A | Author |
| S | Source |
| I | Institution |
| T | Topic |
| K | Keyword |
| P | Publisher |
| F | Funder |
| G | Award (Grant) |
| C | Concept (deprecated) |
IDs are case-insensitive: W2741809807 and w2741809807 are equivalent.
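The structure above can be sketched as a small parser that accepts either the full URL or the bare key and normalizes case. The function name `parse_openalex_id` is hypothetical and not part of any OpenAlex client library.

```python
OPENALEX_BASE = "https://openalex.org/"

# Letter prefixes from the table above.
PREFIX_TO_ENTITY = {
    "W": "Work", "A": "Author", "S": "Source", "I": "Institution",
    "T": "Topic", "K": "Keyword", "P": "Publisher", "F": "Funder",
    "G": "Award (Grant)", "C": "Concept (deprecated)",
}


def parse_openalex_id(value: str) -> tuple:
    """Return (entity_name, normalized_key) for a full ID URL or a bare key."""
    key = value.strip()
    # Drop the base URL if present; the comparison ignores case.
    if key.lower().startswith(OPENALEX_BASE):
        key = key[len(OPENALEX_BASE):]
    # IDs are case-insensitive, so normalize to upper case.
    key = key.upper()
    prefix, digits = key[:1], key[1:]
    if prefix not in PREFIX_TO_ENTITY or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not an OpenAlex ID: {value!r}")
    return PREFIX_TO_ENTITY[prefix], key


print(parse_openalex_id("https://openalex.org/W2741809807"))
print(parse_openalex_id("w2741809807"))
```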
Using IDs in the API
You can use just the key portion when making API calls:
# Full URL
curl "https://api.openalex.org/works/https://openalex.org/W2741809807"
# Just the key (recommended)
curl "https://api.openalex.org/works/W2741809807"
Resolve IDs
Don’t filter by names directly. Entity names are ambiguous—“MIT” could match multiple institutions, “Smith” matches thousands of authors. Always resolve names to IDs first.
When you want to find works by a specific author, institution, journal, or topic, use the two-step pattern:
- Search for the entity to get its OpenAlex ID
- Filter works using that ID
This avoids hallucinated filters and ensures you get the right entity.
The Two-Step Pattern
Step 1: Search for the Entity
Use the search endpoint for the entity type:
# Find an author
curl "https://api.openalex.org/authors?search=Albert+Einstein"
# Find an institution
curl "https://api.openalex.org/institutions?search=MIT"
# Find a journal
curl "https://api.openalex.org/sources?search=Nature"
# Find a topic
curl "https://api.openalex.org/topics?search=machine+learning"
The response includes IDs you can use:
{
  "results": [
    {
      "id": "https://openalex.org/A5012345678",
      "display_name": "Albert Einstein",
      "works_count": 272
    }
  ]
}
Step 2: Filter Works by ID
Use the ID to filter works:
# Works by this author
curl "https://api.openalex.org/works?filter=authorships.author.id:A5012345678"
# Works from this institution
curl "https://api.openalex.org/works?filter=authorships.institutions.id:I136199984"
# Works in this journal
curl "https://api.openalex.org/works?filter=primary_location.source.id:S137773608"
# Works on this topic
curl "https://api.openalex.org/works?filter=topics.id:T12345"
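The two steps can be chained in a few lines of Python. This is a sketch using only the standard library; `works_by_author_name` and `fetch_json` are hypothetical names, and taking the first search hit is a simplification that real code should replace with a check of the candidates.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://api.openalex.org"


def fetch_json(url: str) -> dict:
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def works_by_author_name(name: str, fetch=fetch_json) -> dict:
    """Two-step lookup: resolve an author name to an ID, then filter works."""
    # Step 1: search the authors endpoint for the name.
    search_url = f"{API}/authors?{urlencode({'search': name})}"
    results = fetch(search_url)["results"]
    if not results:
        raise LookupError(f"no author found for {name!r}")

    # Simplification: take the first hit and keep only the key part of its ID.
    author_key = results[0]["id"].rsplit("/", 1)[-1]

    # Step 2: filter works by the resolved ID.
    works_url = f"{API}/works?filter=authorships.author.id:{author_key}"
    return fetch(works_url)
```

Passing `fetch` as a parameter keeps the lookup logic separate from the HTTP client, so it can be swapped for a session with retries or a stub in tests.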
Common Lookups
Find Works by Author Name
# Step 1: Search for the author
curl "https://api.openalex.org/authors?search=Jason+Priem"
# Response: id = "https://openalex.org/A5023888391"
# Step 2: Get their works
curl "https://api.openalex.org/works?filter=authorships.author.id:A5023888391"
Find Works by Institution Name
# Step 1: Search for the institution
curl "https://api.openalex.org/institutions?search=Stanford+University"
# Response: id = "https://openalex.org/I97018004"
# Step 2: Get works from that institution
curl "https://api.openalex.org/works?filter=authorships.institutions.id:I97018004"
Find Works by Journal Name
# Step 1: Search for the journal (source)
curl "https://api.openalex.org/sources?search=Nature"
# Response: id = "https://openalex.org/S137773608"
# Step 2: Get works published there
curl "https://api.openalex.org/works?filter=primary_location.source.id:S137773608"
Find Works by Topic
# Step 1: Search for the topic
curl "https://api.openalex.org/topics?search=CRISPR"
# Response: id = "https://openalex.org/T10234"
# Step 2: Get works on that topic
curl "https://api.openalex.org/works?filter=topics.id:T10234"
When You Have External IDs
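If you already have an external identifier (DOI, ORCID, ROR, ISSN), you can skip the name search. An ORCID or ROR works directly as a filter value, a DOI retrieves the work itself, and an ISSN first resolves to a source whose OpenAlex ID you then filter by: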
# By ORCID (author)
curl "https://api.openalex.org/works?filter=authorships.author.id:https://orcid.org/0000-0003-1613-5981"
# By ROR (institution)
curl "https://api.openalex.org/works?filter=authorships.institutions.id:https://ror.org/042nb2s44"
# By ISSN (source/journal)
curl "https://api.openalex.org/sources/issn:0028-0836"
# Then use the returned OpenAlex ID
Decision Guide
| Input Type | What to Do |
|---|---|
| Name (ambiguous) | Search first, then filter by ID |
| ORCID | Use directly: authorships.author.id:https://orcid.org/... |
| ROR | Use directly: authorships.institutions.id:https://ror.org/... |
| DOI | Get work directly: /works/https://doi.org/... |
| ISSN | Get source by ISSN, then filter by source ID |
| OpenAlex ID | Use directly: authorships.author.id:A123456 |
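The decision guide could be encoded as a small classifier. This is a sketch: `classify_input` is a hypothetical helper, and the regular expressions are simplified assumptions for illustration, not complete validators for each identifier scheme.

```python
import re

# Simplified patterns for illustration; they are not complete validators.
PATTERNS = [
    ("orcid", re.compile(r"^(https://orcid\.org/)?\d{4}-\d{4}-\d{4}-\d{3}[\dX]$", re.I)),
    ("ror", re.compile(r"^https://ror\.org/[0-9a-z]{9}$", re.I)),
    ("doi", re.compile(r"^(https://doi\.org/|doi:)?10\.\d{4,9}/\S+$", re.I)),
    ("issn", re.compile(r"^(issn:)?\d{4}-\d{3}[\dX]$", re.I)),
    ("openalex", re.compile(r"^(https://openalex\.org/)?[WASITKPFGC]\d+$", re.I)),
]

# Next step for each input type, taken from the decision guide above.
NEXT_STEP = {
    "name": "search first, then filter by ID",
    "orcid": "filter with authorships.author.id",
    "ror": "filter with authorships.institutions.id",
    "doi": "get the work directly from /works/",
    "issn": "get the source by ISSN, then filter by source ID",
    "openalex": "use the ID directly in a filter",
}


def classify_input(value: str) -> str:
    """Return the input type; anything unrecognized is treated as a name."""
    text = value.strip()
    for kind, pattern in PATTERNS:
        if pattern.match(text):
            return kind
    return "name"


for example in ("MIT", "https://ror.org/042nb2s44", "0028-0836"):
    kind = classify_input(example)
    print(example, "->", kind, "->", NEXT_STEP[kind])
```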
Handling Ambiguous Results
When searching returns multiple matches, you need to pick the right one:
Use Additional Filters
Narrow down the search:
# Author named "Smith" affiliated with MIT
curl "https://api.openalex.org/authors?search=Smith&filter=last_known_institutions.id:I63966007"
Check Display Name and Metadata
Look at display_name, works_count, cited_by_count, and institutional affiliations to identify the right entity.
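One possible way to order candidates by that metadata is sketched below. The ranking rule (exact name match first, then works_count, then cited_by_count) is an illustrative heuristic, not something OpenAlex prescribes, and `rank_candidates` is a hypothetical helper.

```python
def rank_candidates(results: list, query: str) -> list:
    """Order search results so the most plausible match comes first.

    Heuristic (an assumption, tune for your use case): exact display_name
    matches first, then higher works_count, then higher cited_by_count.
    """
    wanted = query.casefold()

    def sort_key(entity: dict) -> tuple:
        exact = entity.get("display_name", "").casefold() == wanted
        return (
            not exact,                        # False sorts before True
            -entity.get("works_count", 0),
            -entity.get("cited_by_count", 0),
        )

    return sorted(results, key=sort_key)
```

The top-ranked candidate is still only a guess. When the affiliations do not match what you expect, prefer narrowing the search with an additional filter.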
Use Autocomplete for Interactive UIs
The autocomplete endpoint is fast and returns ranked results:
curl "https://api.openalex.org/autocomplete/authors?q=einst"
Filter Field Reference
| To find works by… | Filter field |
|---|---|
| Author | authorships.author.id |
| Author’s institution | authorships.institutions.id |
| Primary source (journal) | primary_location.source.id |
| Any source | locations.source.id |
| Topic | topics.id or primary_topic.id |
| Publisher | primary_location.source.host_organization |
| Funder | funders.id |
Example: Complete Workflow
Find highly cited papers about machine learning from MIT in the last 3 years (here, published after 2022):
# 1. Get MIT's ID
curl "https://api.openalex.org/institutions?search=MIT"
# Result: I63966007
# 2. Get machine learning topic ID
curl "https://api.openalex.org/topics?search=machine+learning"
# Result: T154945302
# 3. Filter works with all criteria
curl "https://api.openalex.org/works?filter=authorships.institutions.id:I63966007,topics.id:T154945302,publication_year:>2022&sort=cited_by_count:desc"
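The combined URL from step 3 can also be assembled programmatically. A sketch with the standard library follows; `build_works_url` is a hypothetical helper, and the IDs are the ones used in the example above.

```python
from urllib.parse import urlencode


def build_works_url(filters: dict, sort: str = "") -> str:
    """Join filters with commas and add an optional sort parameter."""
    params = {"filter": ",".join(f"{key}:{value}" for key, value in filters.items())}
    if sort:
        params["sort"] = sort
    # Keep ':', ',' and '>' unescaped so the URL mirrors the curl example.
    return "https://api.openalex.org/works?" + urlencode(params, safe=":,>")


url = build_works_url(
    {
        "authorships.institutions.id": "I63966007",
        "topics.id": "T154945302",
        "publication_year": ">2022",
    },
    sort="cited_by_count:desc",
)
print(url)
```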
External IDs
You can also retrieve entities using external IDs like DOIs, ORCIDs, and RORs:
# By DOI
curl "https://api.openalex.org/works/https://doi.org/10.7717/peerj.4375"
# By ORCID
curl "https://api.openalex.org/authors/https://orcid.org/0000-0003-1613-5981"
# By ROR
curl "https://api.openalex.org/institutions/https://ror.org/02y3ad647"
# Shorthand format
curl "https://api.openalex.org/works/doi:10.7717/peerj.4375"
Canonical External IDs
Each entity type has a “canonical” external ID—the most widely adopted identifier for that type:
| Entity | Canonical ID |
|---|---|
| Works | DOI |
| Authors | ORCID |
| Sources | ISSN-L |
| Institutions | ROR |
| Topics | Wikidata ID |
| Publishers | Wikidata ID |
Merged IDs
Sometimes we merge duplicate entities (e.g., two author records for the same person). If you request an ID that has been merged into another, the API redirects you to the surviving ID:
$ curl -i https://api.openalex.org/authors/A5092938886
HTTP/1.1 301 MOVED PERMANENTLY
Location: https://api.openalex.org/authors/A5006060960
Most HTTP clients handle this automatically.
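If you store IDs locally, you may want to detect that a merge happened and update your records. The sketch below relies on `urllib.request.urlopen` following the redirect and reads the final URL; `current_author_key` and `key_from_url` are hypothetical helpers.

```python
from urllib.request import urlopen

API = "https://api.openalex.org"


def key_from_url(url: str) -> str:
    """Extract the trailing key (e.g. 'A5006060960') from an entity URL."""
    return url.rstrip("/").rsplit("/", 1)[-1].upper()


def current_author_key(requested_key: str, open_url=urlopen) -> str:
    """Return the key the API finally serves for an author ID.

    urlopen follows the 301 redirect on its own, so the final URL shows
    whether the requested ID was merged into another record.
    """
    response = open_url(f"{API}/authors/{requested_key}")
    try:
        final_url = response.geturl()
    finally:
        response.close()
    return key_from_url(final_url)
```

Comparing the returned key with the requested one tells you whether the stored ID is stale.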
Dehydrated Objects
When entities are nested inside other entities, they’re often returned in dehydrated form—a stripped-down version with only essential fields.
For example, a Work’s authorships field contains dehydrated Author objects:
{
  "authorships": [
    {
      "author": {
        "id": "https://openalex.org/A5023888391",
        "display_name": "Jason Priem",
        "orcid": "https://orcid.org/0000-0001-6187-6610"
      },
      "institutions": [
        {
          "id": "https://openalex.org/I4200000001",
          "display_name": "OurResearch",
          "ror": "https://ror.org/02nr0ka47",
          "country_code": "US",
          "type": "nonprofit"
        }
      ]
    }
  ]
}
To get the full entity, make a separate request using the ID:
curl "https://api.openalex.org/authors/A5023888391"
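To fetch full records for every author on a work, you can first collect the IDs from the dehydrated objects. A minimal sketch, with the hypothetical helpers `author_keys` and `author_urls`:

```python
def author_keys(work: dict) -> list:
    """Collect the key part of each dehydrated author's OpenAlex ID."""
    keys = []
    for authorship in work.get("authorships", []):
        author = authorship.get("author") or {}
        if "id" in author:
            keys.append(author["id"].rsplit("/", 1)[-1])
    return keys


def author_urls(work: dict) -> list:
    """Build one full-entity request URL per author on the work."""
    return [f"https://api.openalex.org/authors/{key}" for key in author_keys(work)]


work = {
    "authorships": [
        {"author": {"id": "https://openalex.org/A5023888391",
                    "display_name": "Jason Priem"}}
    ]
}
print(author_urls(work))
```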
XPAC (Expansion Pack)
In November 2025, OpenAlex added 190+ million new works as part of an expansion called XPAC (part of the Walden rewrite). This includes:
- All of DataCite
- Thousands of institutional and subject-area repositories
- Primarily datasets and repository records
Why XPAC Works Are Excluded by Default
XPAC works have lower data quality on average (improving over time). To avoid surprising users with sudden changes in result counts and quality, XPAC works are excluded by default.
Including XPAC Works
Add include_xpac=true to any works endpoint:
# Without XPAC (default)
curl "https://api.openalex.org/works"
# With XPAC (roughly doubles the count)
curl "https://api.openalex.org/works?include_xpac=true"
Filtering by XPAC
Each work has an is_xpac boolean field:
# Get only XPAC works
curl "https://api.openalex.org/works?include_xpac=true&filter=is_xpac:true"
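The three request variants above differ only in their query parameters, which a small builder can make explicit. This is a sketch; `works_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def works_url(include_xpac: bool = False, only_xpac: bool = False) -> str:
    """Build a /works URL for the default, XPAC-inclusive, or XPAC-only case."""
    params = {}
    # Restricting to XPAC works still requires opting in with include_xpac.
    if include_xpac or only_xpac:
        params["include_xpac"] = "true"
    if only_xpac:
        params["filter"] = "is_xpac:true"
    query = urlencode(params, safe=":")
    return "https://api.openalex.org/works" + (f"?{query}" if query else "")


print(works_url())
print(works_url(include_xpac=True))
print(works_url(only_xpac=True))
```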
Query Parameter Naming
OpenAlex uses snake_case for all query parameters: filter, sort, group_by, per_page, api_key, etc.
What’s Next?