
MongoDB GridFS

GridFS is a MongoDB specification for storing and retrieving large files. It lets us store files that exceed 16 MB, the maximum size of a single BSON document in MongoDB.

Basic Concepts

How GridFS Works

GridFS splits a file into chunks (255 KB each by default; the last chunk may be smaller) and uses two collections, which are named fs.files and fs.chunks in the default bucket:

  1. files Collection: Stores one document per file, holding the file's metadata.
  2. chunks Collection: Stores the binary content of the file, one document per chunk.

When we read a file, the driver fetches its chunks in order and reassembles them into the original content.
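
To make the chunking idea concrete, here is a minimal in-memory sketch of splitting and reassembling a file. It is only an illustration: the function names (`splitIntoChunks`, `joinChunks`) and the `"demo-file-id"` value are made up for this example, and the real driver performs these steps internally against the chunks collection.

```javascript
// Illustrative only: mimics how GridFS lays a file out as chunk documents.
// The real driver does this internally when we write to an upload stream.
const DEFAULT_CHUNK_SIZE = 255 * 1024 // 255 KB, the GridFS default

// Split a Buffer into documents shaped like those in the chunks collection.
function splitIntoChunks(fileId, data, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunks = []
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId, // links the chunk to its document in the files collection
      n, // zero-based position of the chunk within the file
      data: data.subarray(offset, offset + chunkSize),
    })
  }
  return chunks
}

// Rebuild the original bytes by ordering the chunks on n and concatenating them.
function joinChunks(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n)
  return Buffer.concat(ordered.map((chunk) => chunk.data))
}

const file = Buffer.alloc(600 * 1024, 7) // a 600 KB stand-in for a real file
const chunks = splitIntoChunks("demo-file-id", file)
console.log(chunks.length) // 3
console.log(joinChunks(chunks).equals(file)) // true
```

A 600 KB file needs three chunks: two full 255 KB chunks and one 90 KB remainder.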

Characteristics of GridFS

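  1. Storing Large Files: GridFS allows us to store files larger than the 16 MB document limit.
  2. Distributed Storage: Because files and chunks live in ordinary collections, they can be replicated across a replica set, and the chunks collection can be sharded across multiple servers.
  3. Data Backup and Recovery: GridFS has no backup mechanism of its own; its collections are covered by the same backup, restore, and replication tooling as the rest of the database.
  4. Metadata Management: GridFS stores metadata for each file, such as its filename, size, and upload date, and lets us attach custom metadata.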

Using GridFS

Uploading a File

javascript
// Upload a file
const fs = require("fs")
const { MongoClient, GridFSBucket } = require("mongodb")

async function uploadFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()

  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)

  const readableStream = fs.createReadStream("/path/to/file")
  const uploadStream = bucket.openUploadStream("filename")

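  readableStream.pipe(uploadStream)

  // Close the connection whether the upload succeeds or fails
  uploadStream.on("finish", () => {
    console.log("File uploaded successfully")
    client.close()
  })
  uploadStream.on("error", (err) => {
    console.error("Upload failed:", err)
    client.close()
  })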
}

uploadFile()
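
As a variation on the upload above, the sketch below attaches custom metadata and overrides the chunk size through the options of `openUploadStream`. It assumes `bucket` is a `GridFSBucket` created as in the upload example; the helper name `uploadWithMetadata`, the 1 MB chunk size, and the shape of `metadata` are choices made for this example only.

```javascript
const fs = require("fs")
const { pipeline } = require("stream/promises")

// `bucket` is assumed to be a GridFSBucket; `metadata` is free-form and chosen by the caller.
async function uploadWithMetadata(bucket, sourcePath, filename, metadata) {
  const uploadStream = bucket.openUploadStream(filename, {
    chunkSizeBytes: 1024 * 1024, // optional: override the 255 KB default
    metadata, // stored in the "metadata" field of the files document
  })

  // pipeline() rejects if either the file read or the GridFS write fails
  await pipeline(fs.createReadStream(sourcePath), uploadStream)

  return uploadStream.id // _id of the new document in the files collection
}
```

A call might look like `await uploadWithMetadata(bucket, "/path/to/file", "filename", { owner: "alice" })`, where `owner` is a hypothetical field.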

Downloading a File

javascript
// Download a file
const fs = require("fs")
const { MongoClient, GridFSBucket } = require("mongodb")

async function downloadFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()

  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)

  const downloadStream = bucket.openDownloadStreamByName("filename")
  const writableStream = fs.createWriteStream("/path/to/destination")

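  downloadStream.pipe(writableStream)

  // Close the connection whether the download succeeds or fails
  writableStream.on("finish", () => {
    console.log("File downloaded successfully")
    client.close()
  })
  downloadStream.on("error", (err) => {
    console.error("Download failed:", err)
    client.close()
  })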
}

downloadFile()
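
When a file is small enough to hold in memory, it can be convenient to read it into a `Buffer` instead of writing it to disk. The following is a minimal sketch under that assumption; `downloadToBuffer` is a hypothetical helper name, and `bucket` is assumed to be a `GridFSBucket` created as in the download example.

```javascript
// `bucket` is assumed to be a GridFSBucket. Only suitable for files that fit in memory.
async function downloadToBuffer(bucket, filename) {
  const parts = []

  // The download stream is a Node.js Readable, so it can be consumed with for await;
  // a stream error (for example, file not found) surfaces as a thrown exception.
  for await (const part of bucket.openDownloadStreamByName(filename)) {
    parts.push(part)
  }

  return Buffer.concat(parts)
}
```

For large files, streaming to a destination as shown above remains the safer choice, because this helper keeps the whole file in memory.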

Querying a File

javascript
// Query a file
const { MongoClient } = require("mongodb")

async function queryFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()

  const db = client.db("mydatabase")
  const filesCollection = db.collection("fs.files")

  const file = await filesCollection.findOne({ filename: "filename" })
  console.log(file)

  await client.close()
}

queryFile()
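
Instead of addressing the files collection by name, we can also query through the bucket itself with `bucket.find()`, which keeps working if the bucket uses a non-default name. The sketch below assumes `bucket` is a `GridFSBucket`; `listFiles` and the shape of the returned summaries are made up for this example.

```javascript
// `bucket` is assumed to be a GridFSBucket; `filter` is an ordinary MongoDB query filter.
async function listFiles(bucket, filter = {}) {
  // find() queries the bucket's files collection and returns a cursor
  const files = await bucket.find(filter).toArray()

  // Keep only the fields most callers need from each files document
  return files.map(({ _id, filename, length, uploadDate }) => ({
    id: _id,
    filename,
    sizeBytes: length,
    uploadDate,
  }))
}
```

For example, `await listFiles(bucket, { filename: "filename" })` returns every stored file with that name; GridFS does not require filenames to be unique.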

Deleting a File

javascript
// Delete a file
const { MongoClient, GridFSBucket, ObjectId } = require("mongodb")

async function deleteFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()

  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)

  // delete() takes the file's _id and removes its metadata document and all of its chunks
  await bucket.delete(new ObjectId("5e8f8f8f8f8f8f8f8f8f8f8f"))
  console.log("File deleted successfully")

  await client.close()
}

deleteFile()
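
Because `bucket.delete()` works on the file's `_id`, deleting by filename takes a lookup first. One possible sketch is shown below; `deleteByName` is a hypothetical helper, and `bucket` is assumed to be a `GridFSBucket` created as in the delete example.

```javascript
// `bucket` is assumed to be a GridFSBucket.
async function deleteByName(bucket, filename) {
  // Several files may share one filename, so collect every match
  const files = await bucket.find({ filename }).toArray()

  for (const file of files) {
    await bucket.delete(file._id) // removes the files document and its chunks
  }

  return files.length // number of files removed
}
```

Note that this removes every file stored under that name; a caller that wants to keep older revisions would need a narrower filter.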

Best Practices for GridFS

Choose the Right File Size

GridFS is designed for large files, so it is usually a poor fit for small ones. If all of our files are smaller than the 16 MB document limit, we should consider storing each file directly in a document (for example, as a binary field) instead of using GridFS.
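
The size rule above can be expressed as a small helper. This is only a sketch: `chooseStorage` is a hypothetical function, and the 64 KB of headroom reserved for the document's other fields is an arbitrary assumption rather than a MongoDB constant.

```javascript
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024 // MongoDB's single-document limit
const DOCUMENT_OVERHEAD_BYTES = 64 * 1024 // assumed headroom for other fields (hypothetical)

// Decide where a file of the given size should be stored.
function chooseStorage(fileSizeBytes) {
  const fitsInDocument = fileSizeBytes + DOCUMENT_OVERHEAD_BYTES < MAX_BSON_DOCUMENT_BYTES
  return fitsInDocument ? "document" : "gridfs"
}

console.log(chooseStorage(2 * 1024 * 1024)) // document
console.log(chooseStorage(50 * 1024 * 1024)) // gridfs
```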

Optimize Queries

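File metadata is queried from the files collection like any other data. The driver already maintains an index on filename and uploadDate in the files collection (and on files_id and n in the chunks collection). If we filter on other fields, such as custom metadata, we can create additional indexes on the files collection to speed up those queries.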
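
As an illustration, the sketch below adds an index for queries that filter on a custom metadata field. The field `metadata.owner` and the helper name `ensureOwnerIndex` are hypothetical; `db` is assumed to be a `Db` handle obtained from `client.db(...)`, and `"fs"` is the default bucket name.

```javascript
// `db` is assumed to be a Db handle; "metadata.owner" is a hypothetical custom field.
async function ensureOwnerIndex(db, bucketName = "fs") {
  // Supports queries such as { "metadata.owner": "alice" } sorted by newest upload first
  return db
    .collection(`${bucketName}.files`)
    .createIndex({ "metadata.owner": 1, uploadDate: -1 })
}
```

Creating an index that already exists with the same definition is a no-op, so a helper like this can be called at application startup.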

Regular Cleaning

GridFS does not remove files on its own, so we should regularly delete files that are no longer used to reclaim storage space.
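
What counts as "no longer used" depends on the application. As one possible policy, the sketch below deletes files whose upload date is older than a given number of days. The helper name `deleteOlderThan` and the age-based rule are assumptions for illustration; `bucket` is assumed to be a `GridFSBucket`.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000

// `bucket` is assumed to be a GridFSBucket; age-based expiry is just one possible policy.
async function deleteOlderThan(bucket, maxAgeDays, now = new Date()) {
  const cutoff = new Date(now.getTime() - maxAgeDays * MS_PER_DAY)

  // uploadDate is set by the driver when the file is stored
  const expired = await bucket.find({ uploadDate: { $lt: cutoff } }).toArray()

  for (const file of expired) {
    await bucket.delete(file._id) // removes the files document and its chunks
  }

  return expired.length // number of files removed
}
```

A job scheduler could run something like `await deleteOlderThan(bucket, 30)` once a day.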

Summary

GridFS is MongoDB's specification for storing files larger than the 16 MB document limit. It splits each file into chunks, stores the chunks in one collection and the file's metadata in another, and reassembles the chunks when the file is read. Because both are ordinary collections, GridFS data benefits from MongoDB's replication, sharding, and backup tooling. When using GridFS, we should reserve it for files that are actually large, index the metadata fields we query on, and regularly delete unused files.
