What File Types and Sizes Can ChatGPT, Claude, and Gemini Accept?

If you’re regularly using large files and obscure file formats, you know that general AI platforms like Claude and ChatGPT often run out of tokens fast when you’re trying to upload a big file.

What File Formats & Sizes Can Top AI Platforms Accept?

At a glance, this section showcases ChatGPT, Gemini, & Claude’s accepted file formats and sizes in comparison with Lium’s AI platform built for big, complex data:

ChatGPT accepts files up to 512 MB. That doesn’t work for people in industries that rely on dense, complex data for informed decision-making. To make matters worse, ChatGPT and similar platforms can’t accept file formats like NetCDF and SEG-Y that are commonly used in advanced industries.

So what file sizes and formats can Claude, Gemini, and ChatGPT accept? And is there an alternative that can process larger, less common file formats for the industries that need it? Here’s what to know about the file processing capabilities of today’s general AI platforms, and the solutions that Lium offers for larger datasets (for free).

Let’s break down each platform’s file processing capabilities:

ChatGPT

What File Formats Does ChatGPT Always Accept?

You can upload documents, spreadsheets, and images directly to ChatGPT, including the following file formats:

Documents: DOCX, PDF, TXT, RTF
Spreadsheets: XLSX, XLS, CSV
Presentations: PPTX
Images: JPG, WEBP, PNG
Code Files: PY, JS, HTML, TSV, SQL, etc.
Audio Files: MP3

ChatGPT can sometimes interpret other file types as well, but not with full reliability.

When ChatGPT was asked directly about formats that aren’t fully supported, it responded that there are other additional file types it may be able to successfully crawl, “but parsing/analysis reliability varies based on codec, encoding, container structure, corruption, size or platform implementation,” meaning it may not be able to provide answers to your questions based on the provided data.
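As a quick pre-upload check, the list above can be turned into a lookup table. The following is a minimal sketch, not an official API: the `CHATGPT_LISTED` table and the `listed_category` function are hypothetical names, and the extensions come only from the list in this article (the "etc." in the code-file row means the set is not exhaustive).

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the article's list of formats ChatGPT accepts.
# This is an illustrative table, not an official or complete specification.
CHATGPT_LISTED = {
    "documents": {".docx", ".pdf", ".txt", ".rtf"},
    "spreadsheets": {".xlsx", ".xls", ".csv"},
    "presentations": {".pptx"},
    "images": {".jpg", ".webp", ".png"},
    "code": {".py", ".js", ".html", ".tsv", ".sql"},
    "audio": {".mp3"},
}


def listed_category(filename: str) -> Optional[str]:
    """Return the category a file falls under, or None if its extension is not listed."""
    ext = Path(filename).suffix.lower()
    for category, extensions in CHATGPT_LISTED.items():
        if ext in extensions:
            return category
    return None


if __name__ == "__main__":
    for name in ["report.PDF", "survey.segy", "climate.nc"]:
        print(name, "->", listed_category(name))
```

A result of `None` does not mean an upload will fail. It only means the extension is not on the list above, so parsing reliability may vary.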

What File Formats Does ChatGPT Not Accept?

For advanced sectors that use obscure file formats, ChatGPT often cannot crawl or interpret the data. When asked which file formats it cannot process, OpenAI’s LLM called out many that are commonly used in these advanced industries:

Geospatial / GIS: .gdb, .mdb, .sid, .ecw, .e00, .adf, .3mx, .slpk, large .las/.laz point clouds
Satellite / Remote Sensing: .nitf/.nsif, .SAFE, .he5, .img (ERDAS), .pix, proprietary SAR collections, hyperspectral cubes
Energy / Utilities: .sav, .dyr, .aux, Petrel project files, CMG reservoir simulation files, ECLIPSE reservoir models (.DATA, .GRID, .UNRST), SEG-D seismic files
Engineering / CAD: .rvt, .max, .blend, .prt, .par, .asm, .jt, .cgr, complex BIM coordination models
Scientific Research: .root, .sav (SPSS), .zsav, .nii/.nii.gz, .czi, .lif, .nd2, large microscopy stacks
Finance / Quantitative: Bloomberg .bbg, Reuters proprietary feeds, kdb+/q databases, proprietary tick databases, market replay archives
Manufacturing / Industrial: .acd, .zap, Siemens TIA Portal project files, Fanuc robot programs, industrial historian archives, PLC firmware images
Life Sciences / Biotech: .cram, .gff, .gtf, .loom, .h5ad, .cif, molecular dynamics trajectory files (.dcd, .xtc)

Areas where the platform commonly struggles include proprietary binaries, compiled artifacts, and hardware-dependent formats.

Even when these formats are crawled successfully, ChatGPT often can’t semantically understand the content, meaning it can’t provide the fully accurate answers your complex questions demand.

What Maximum File Sizes Can ChatGPT Accept?

General file size: For any file, including documents, spreadsheets, code, and proprietary files, the maximum file size is 512 megabytes (MB).
For text documents, the maximum number of tokens is 2 million, or about 1500 standard pages of text.
For image files: 20 MB per image (for both direct uploads as well as embedded visual content)

If you’re in a data-dense sector, the 512 MB limit means you can’t even upload large documents and spreadsheets, let alone complex proprietary datasets.
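The limits above are simple enough to check before attempting an upload. Here is a minimal sketch under stated assumptions: it treats 1 MB as 1,048,576 bytes (the article does not say whether decimal or binary megabytes apply), and `fits_chatgpt_limit` and `approx_pages` are hypothetical helper names. The pages estimate just restates the article's ratio of 2 million tokens to about 1,500 pages.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: binary megabytes

GENERAL_LIMIT_BYTES = 512 * MB   # any file, per the article
IMAGE_LIMIT_BYTES = 20 * MB      # per image, per the article
TEXT_TOKEN_LIMIT = 2_000_000     # text documents, per the article
PAGES_AT_TOKEN_LIMIT = 1500      # the article's rough page equivalent

IMAGE_EXTENSIONS = {".jpg", ".webp", ".png"}


def fits_chatgpt_limit(filename: str, size_bytes: int) -> bool:
    """Compare a file's size with the limit that applies to its type."""
    ext = Path(filename).suffix.lower()
    limit = IMAGE_LIMIT_BYTES if ext in IMAGE_EXTENSIONS else GENERAL_LIMIT_BYTES
    return size_bytes <= limit


def approx_pages(token_count: int) -> float:
    """Convert a token count to pages using the article's 2M-tokens-to-1500-pages ratio."""
    tokens_per_page = TEXT_TOKEN_LIMIT / PAGES_AT_TOKEN_LIMIT  # about 1,333 tokens per page
    return token_count / tokens_per_page


if __name__ == "__main__":
    print(fits_chatgpt_limit("wells.csv", 700 * MB))  # a 700 MB spreadsheet is over the limit
    print(fits_chatgpt_limit("map.png", 5 * MB))
    print(round(approx_pages(400_000)))
```

Passing the size check only rules out one failure mode. A text file under 512 MB can still exceed the 2 million token cap.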

Claude

Which File Types Can Be Crawled & Processed by Claude?

Claude can accept the following file formats (as long as the file is below 500 MB):

Documents: PDF, DOCX, TXT, HTML, ODT, RTF, EPUB, JSON
Spreadsheets: CSV, XLSX
Presentations: PPTX
Images: JPEG, PNG, GIF, WebP
Code: Python (.py), JavaScript (.js), TypeScript (.ts), and other common code file formats

What File Sizes Can Claude Accept?

Even the “accepted” file types listed above cannot be processed beyond 500 MB as of June 2026.

Within a project, the limit on Anthropic’s Claude is 30 MB per file. You can upload an unlimited number of files, but the combined content must fit within Claude’s context window.

What File Formats is Claude Unable to Upload?

Claude runs into challenges processing and interpreting a wide range of file types commonly used in advanced sectors, including:

Geospatial / GIS: .shp, .shx, .dbf, .tif/.tiff, .kml/.kmz, .las/.laz, .gdb, .sid
Satellite / Remote Sensing: .ntf, .h5/.hdf5, .nc, .fits, .grb/.grib2
Energy / Utilities: .osh, .xml (CIM), .raw, .las (well logs), ECLIPSE simulation files
Engineering / CAD: .dwg, .dxf, .rvt, .stp/.step, .igs, .stl, .x_t/.x_b, .CATPart/.CATProduct
Scientific Research: .mat, .rds/.rda/.rdata, .h5, .parquet, .dcm, .mrc, .nii
Finance / Quantitative: FIX logs, .bbg, .tick, HDF5 time series archives
Manufacturing / Industrial: .prt, .sldprt/.sldasm, .l5x/.acd, OPC-UA exports
Life Sciences / Biotech: .fastq/.fasta, .bam/.sam, .vcf, .mol/.sdf, .pdb

Claude can process some of these file types and flatly rejects others. Even for the ones it accepts, Anthropic’s LLM itself admits that while it “can receive [some of these file formats] and extract whatever plain text or metadata is readable, [it] cannot process the actual data structure, spatial relationships, or specialized encoding.”

That doesn’t work for advanced sectors that require fully accurate and multimodal decision-making.

Gemini

What Are the Standard File Formats Supported by Gemini?

Google Gemini is able to process the following file types:

Documents: PDF, DOCX, TXT, HTML
Spreadsheets: XLSX, CSV
Presentations: PPTX, Slides export
Images: JPG, PNG, WebP, SVG
Audio: WAV, MP3, FLAC
Video: MP4, MOV
Code: .py, .js, .java, .cpp, .html
ZIP (images/frames): .zip

What is Gemini’s Maximum File Size Limit?

When uploading files to the Gemini App, keep the following restrictions in mind:

Documents: 100 MB per file and a cap of 10 files per prompt
Images: 100 MB per file
Videos: Up to 2 gigabytes (GB) and a 5-minute length with the basic plan
Audio: 100 MB per file and up to a 10-minute length with the basic plan
Code & ZIP files: 100 MB with a max of 5,000 files within a single archive.
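The ZIP restriction in the list above has two parts, archive size and file count, and both can be checked locally with Python's standard `zipfile` module. This is a rough sketch: `check_zip_against_gemini_limits` is a hypothetical helper, 1 MB is assumed to be 1,048,576 bytes, and the numbers are the ones quoted in this article rather than values read from Gemini itself.

```python
import zipfile
from pathlib import Path
from typing import List

MB = 1024 * 1024  # assumption: binary megabytes

ZIP_LIMIT_BYTES = 100 * MB  # "Code & ZIP files: 100 MB", per the article
MAX_ZIP_ENTRIES = 5000      # max files within a single archive, per the article


def check_zip_against_gemini_limits(
    path: str,
    max_bytes: int = ZIP_LIMIT_BYTES,
    max_entries: int = MAX_ZIP_ENTRIES,
) -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []

    # Check 1: size of the archive file on disk.
    size = Path(path).stat().st_size
    if size > max_bytes:
        problems.append(f"archive is {size} bytes, over the {max_bytes}-byte limit")

    # Check 2: number of files inside the archive (directory entries are skipped).
    with zipfile.ZipFile(path) as archive:
        entry_count = sum(1 for info in archive.infolist() if not info.is_dir())
    if entry_count > max_entries:
        problems.append(f"archive holds {entry_count} files, over the {max_entries}-file limit")

    return problems
```

Calling `check_zip_against_gemini_limits("frames.zip")` on a hypothetical archive returns an empty list when both checks pass. The limits are parameters so they can be adjusted if the published numbers change.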

What Advanced File Formats Cannot Be Fully Processed & Integrated by Gemini?

To crawl and fully interpret the following data types in Gemini, you’ll need to manually convert them to CSV or JSON first, and the converted file must still fit within Gemini’s maximum file size.

If a file stays in its native format or is too large, you can’t work with these advanced file formats in Gemini:

Geospatial / GIS: .shp, .shx, .dbf, .kml/.kmz, .gdb, .tif/.tiff, .las/.laz, .sid
Engineering / CAD: .stp/.step, .igs/.iges, .stl, .x_t/.x_b, .prt, .sldprt/.sldasm, .rvt, .dwg/.dxf
Scientific Research: .mat, .rds/.rda, .h5/.hdf5, .parquet, .dcm, .nii, .mrc
Life Sciences / Biotech: .fastq/.fasta, .bam/.sam, .vcf, .mol/.sdf, .pdb
Manufacturing / Industrial: .l5x/.acd, OPC-UA exports, ECLIPSE simulation files, .osh
Satellite / Remote Sensing: .ntf, .nc, .fits, .grb/.grib2
Finance / Quantitative: .bbg, .tick, FIX logs
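To give a sense of what the manual conversion described above involves, here is a minimal sketch for one of the simpler cases. FASTA is a plain-text format (a `>` header line followed by sequence lines), so it can be flattened to CSV with the standard library alone. The function names are hypothetical, and binary formats on the list, such as BAM or HDF5, would need specialized libraries that this sketch does not cover.

```python
import csv
import io
from typing import List, Tuple


def fasta_to_rows(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, joining wrapped sequence lines."""
    rows = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if header is not None:
                rows.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        rows.append((header, "".join(chunks)))
    return rows


def rows_to_csv(rows: List[Tuple[str, str]]) -> str:
    """Serialize the records as CSV text with an 'id,sequence' header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "sequence"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = ">seq1 example\nACGT\nTTGA\n>seq2\nGGCC\n"
    print(rows_to_csv(fasta_to_rows(sample)))
```

Note that flattening keeps the sequences but drops nothing only because FASTA is already text. For formats with spatial or hierarchical structure, a CSV export typically loses that structure.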

Can Lium Process & Interpret Large, Complex File Formats?

Yes. Advanced industries hit roadblocks because general AI platforms can’t process and interpret most complex datasets, but Lium was built with those industries and their proprietary data in mind.

When you connect any file format, regardless of size, Lium automatically indexes the selected files so that the answer engine can interpret them within a few moments. That means Lium can fetch all of your proprietary data in its original file format, unlike general AI, which requires the contents to be converted to an accepted format and uploaded to the platform.

Not convinced? Try Lium yourself for free. Complex, domain-specific file formats that Lium regularly crawls and extracts insights from (without the need for format conversion) include RDF, NetCDF, GRIB2, HDF5, SEG-Y, CCSDS, BUFR, BIM/IFC, FASTQ, FITS. The file is the input.

Why General AI’s Inability to Crawl These File Types & Sizes is Only Part of the Problem (& How Lium Solves It)

Even if platforms like ChatGPT, Claude, and Gemini could process your dataset, your proprietary data is often so advanced and requires such a nuanced understanding of your sector that they can’t answer questions with the precision and depth you need. And when general AI treats every file as a one-session interaction, you can’t use it for multimodal reasoning that extracts and blends data from multiple datasets at once.

Lium’s advanced industry AI was built to not only process large, complex datasets, but to interpret them multimodally to provide a POV from an industry-expert’s perspective.

Here’s where general AI hits roadblocks, and how Lium was built to deliver accurate results with industry-specific context:

Data-dense Reasoning:

LLMs carry strong general knowledge, but the expertise that drives critical decisions inside a specialized industry lives in internal data, institutional processes, and domain-specific knowledge accumulated over years.

Models like ChatGPT typically treat uploaded files as a reference while the model continues to reason from its general training on public Internet data. That is NOT the same as reasoning over your own data directly.

What You Need: True Proprietary Reasoning, Not Public Bias

Lium works directly within your proprietary data environment. It surfaces answers that exist entirely within your data, rather than answers from general AI that can be skewed by publicly crawled data.

One-Session Interactions with Data

General AI tools like Claude, ChatGPT, and Gemini have announced updates that allow for memory, moving toward session continuity and persistent context. This is meaningful progress IF you’re using AI for general knowledge tasks. But when you’re working with advanced proprietary data, it still isn’t cutting it.

Memory is not the same as a purpose-built data environment. When your organization’s workflows depend on proprietary datasets that are indexed, structured, and refined to accurately reflect the realities of your business’s unique operations and pain points, general memory features fall short. They carry context forward from session to session, but do NOT build a compounding, reusable knowledge base around your data.

You Need AI Built for Advanced Data

Lium’s agentic harness is designed to bring nuanced understanding to the complex questions in high-stakes advanced sectors. Once data is integrated into your workflow and complex questions are asked and answered within Lium’s environment, the knowledge base grows.

Whether you’re working with geospatial, subsurface, financial, or any type of complex data, you can run that workflow without any development background. Lium’s answer engine reflects the deep sector knowledge your organization’s proprietary data offers, not a general model's approximation of it.

See Lium’s Ability to Crawl Any File Type in a Moment

Don’t let general AI’s inability to read and reason with obscure file types hold you back from getting fast answers to your most complicated problems.

Sign up for free today to see Lium’s ability to crawl even the most complex data sets and provide 100% accurate answers with nuanced understanding of your advanced industry.

Written by Harrison Kelly

Technology Writer

Harrison Kelly is a B2B SEO & Content Marketing Consultant and freelance writer with more than a decade of experience working and writing for technology companies. Notable software brands that Harrison has published work for include ZenDesk, SkyFi satellites, GovPilot, Classmates.com, and Belong Home. He graduated from The College of New Jersey with a business degree. He uses artificial intelligence daily to solve complex problems and speed up routine processes.
