MongoDB GridFS
GridFS is MongoDB's specification for storing and retrieving large files. It lets us store files larger than 16MB, which is the size limit for a single MongoDB (BSON) document.
Basic Concepts
How GridFS Works
GridFS splits a file into chunks (255KB each by default) and uses two collections, named fs.files and fs.chunks by default:
- files Collection (fs.files): Stores one metadata document per file.
- chunks Collection (fs.chunks): Stores the content chunks of the file, each tagged with the file's id and its position in the file.
When we need to read a file, GridFS combines these chunks and returns the complete file.
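The split-and-combine behaviour described above can be sketched in plain JavaScript without a database. This is only an illustration of the idea, not the driver's implementation: the helper names splitIntoChunks and joinChunks are hypothetical, and the only value taken from GridFS itself is the 255KB default chunk size.

```javascript
// Default GridFS chunk size: 255 KB
const DEFAULT_CHUNK_SIZE = 255 * 1024

// Split a Buffer into chunk-like objects; `n` is the chunk's position in the file
function splitIntoChunks(data, chunkSize = DEFAULT_CHUNK_SIZE) {
  const chunks = []
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({ n, data: data.subarray(offset, offset + chunkSize) })
  }
  return chunks
}

// Rebuild the original bytes by concatenating the chunks in order of `n`
function joinChunks(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n)
  return Buffer.concat(ordered.map((chunk) => chunk.data))
}

const original = Buffer.alloc(600 * 1024, 7) // a 600 KB payload for illustration
const chunks = splitIntoChunks(original)
console.log(chunks.length)                       // 3
console.log(chunks.map((c) => c.data.length))    // [ 261120, 261120, 92160 ]
console.log(joinChunks(chunks).equals(original)) // true
```

Only the last chunk is smaller than the chunk size, which is why a 600KB file produces two full chunks and one 90KB chunk.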
Characteristics of GridFS
- Storing Large Files: GridFS allows us to store files larger than 16MB.
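- Distributed Storage: Because GridFS data lives in ordinary collections, it can be replicated across a replica set, and the chunks collection can be sharded across multiple servers.
- Data Backup and Recovery: GridFS files are covered by the same backup and recovery tools as the rest of the database.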
- Metadata Management: GridFS allows us to store metadata of files, such as filename, size, type, etc.
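As a rough sketch of the metadata point, custom metadata can be attached when a file is uploaded by passing a metadata option to openUploadStream. The function name uploadWithMetadata and the metadata fields used in the usage note are hypothetical; the bucket argument is assumed to be a GridFSBucket created from the mongodb driver.

```javascript
const { pipeline } = require("stream/promises")

// `bucket` is assumed to be a GridFSBucket; `source` is any readable stream.
// `metadata` is a free-form object stored on the file's document in fs.files.
async function uploadWithMetadata(bucket, source, filename, metadata) {
  const uploadStream = bucket.openUploadStream(filename, { metadata })
  // pipeline() resolves once all data is written and rejects on any stream error
  await pipeline(source, uploadStream)
  return uploadStream.id // the _id of the new document in fs.files
}
```

It could be called as uploadWithMetadata(bucket, fs.createReadStream("/path/to/file"), "filename", { owner: "alice", type: "pdf" }), where the owner and type fields are made up for illustration.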
Using GridFS
Uploading a File
// Upload a file
const fs = require("fs")
const { MongoClient, GridFSBucket } = require("mongodb")

async function uploadFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()
  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)
  // Stream the local file into GridFS under the name "filename"
  const readableStream = fs.createReadStream("/path/to/file")
  const uploadStream = bucket.openUploadStream("filename")
  readableStream.pipe(uploadStream)
  uploadStream.on("finish", () => {
    console.log("File uploaded successfully")
    client.close()
  })
  // Close the connection if the upload fails as well
  uploadStream.on("error", (err) => {
    console.error("Upload failed:", err)
    client.close()
  })
}

uploadFile()

Downloading a File
// Download a file
const fs = require("fs")
const { MongoClient, GridFSBucket } = require("mongodb")

async function downloadFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()
  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)
  // Stream the stored file named "filename" to a local path
  const downloadStream = bucket.openDownloadStreamByName("filename")
  const writableStream = fs.createWriteStream("/path/to/destination")
  downloadStream.pipe(writableStream)
  writableStream.on("finish", () => {
    console.log("File downloaded successfully")
    client.close()
  })
  // Raised, for example, when no file with that name exists
  downloadStream.on("error", (err) => {
    console.error("Download failed:", err)
    client.close()
  })
}

downloadFile()

Querying a File
// Query a file
const { MongoClient } = require("mongodb")

async function queryFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()
  const db = client.db("mydatabase")
  // File metadata lives in the fs.files collection of the default bucket
  const filesCollection = db.collection("fs.files")
  const file = await filesCollection.findOne({ filename: "filename" })
  console.log(file)
  await client.close()
}

queryFile()

Deleting a File
// Delete a file
const { MongoClient, GridFSBucket, ObjectId } = require("mongodb")

async function deleteFile() {
  const client = new MongoClient("mongodb://localhost:27017")
  await client.connect()
  const db = client.db("mydatabase")
  const bucket = new GridFSBucket(db)
  // Removes the file's document in fs.files and all of its chunks
  await bucket.delete(new ObjectId("5e8f8f8f8f8f8f8f8f8f8f8f"))
  console.log("File deleted successfully")
  await client.close()
}

deleteFile()

Best Practices for GridFS
Choose the Right File Size
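Although GridFS can store very large files, it adds overhead for small ones, because every file needs a metadata document plus at least one chunk document. If a file fits comfortably within the 16MB document limit, it is usually simpler to store it directly in a document (for example, as a binary field) instead of using GridFS.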
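A minimal sketch of this size rule is shown below. The function name chooseStorage and the 1MB headroom are assumptions made for illustration; the headroom is there because the 16MB limit applies to the whole document, including its other fields, not only to the file bytes.

```javascript
// Size limit of a single MongoDB document
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024
// Hypothetical safety margin left for the document's other fields
const HEADROOM_BYTES = 1024 * 1024

// Returns "document" when the file can be embedded, "gridfs" otherwise
function chooseStorage(fileSizeBytes) {
  return fileSizeBytes <= MAX_DOCUMENT_BYTES - HEADROOM_BYTES ? "document" : "gridfs"
}

console.log(chooseStorage(200 * 1024))       // document
console.log(chooseStorage(50 * 1024 * 1024)) // gridfs
```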
Optimize Queries
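Queries on file metadata run against the files collection, so indexes on that collection directly affect their performance. Drivers create an index on filename and uploadDate for us; for any other field we filter on frequently, such as a custom metadata field, we should create an index ourselves.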
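As a sketch of the indexing advice, the helpers below create an index on a custom metadata field and then query by it. The field metadata.owner and both function names are hypothetical; db is assumed to be a Db handle from the mongodb driver, and fs.files is the files collection of the default bucket.

```javascript
// `db` is assumed to be a Db handle obtained from client.db(...).
// "metadata.owner" is a hypothetical custom field used only for illustration.
async function ensureOwnerIndex(db) {
  // createIndex does nothing if an identical index already exists
  return db.collection("fs.files").createIndex({ "metadata.owner": 1 })
}

// Look up file metadata by owner; this query can use the index created above
async function findFilesByOwner(db, owner) {
  return db.collection("fs.files").find({ "metadata.owner": owner }).toArray()
}
```

Creating the index once at application startup is usually enough; the query helper can then be called as often as needed.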
Regular Cleaning
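We should regularly delete files that are no longer used to free storage space. Files should be deleted through the GridFS API rather than by removing documents from the files collection directly, so that their chunks are removed as well.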
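One possible cleanup routine is sketched below. It assumes that "no longer used" can be approximated by upload age, which may not fit every application; the function name deleteFilesOlderThan is hypothetical, and bucket is assumed to be a GridFSBucket.

```javascript
// `bucket` is assumed to be a GridFSBucket; `cutoff` is a Date.
// Deletes every file uploaded before `cutoff` and returns how many were removed.
async function deleteFilesOlderThan(bucket, cutoff) {
  // Collect the matching metadata documents first, then delete them one by one
  const oldFiles = await bucket.find({ uploadDate: { $lt: cutoff } }).toArray()
  let deleted = 0
  for (const file of oldFiles) {
    await bucket.delete(file._id) // removes the file's metadata and its chunks
    deleted++
  }
  return deleted
}
```

For example, deleteFilesOlderThan(bucket, new Date(Date.now() - 30 * 24 * 60 * 60 * 1000)) would remove files uploaded more than 30 days ago.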
Summary
GridFS is MongoDB's specification for storing large files. It splits a file into chunks and uses two collections: one for file metadata and one for the chunks. When we read a file, GridFS combines the chunks and returns the complete file. GridFS lets us store files larger than 16MB and keep metadata alongside them, and because the data lives in ordinary collections, it benefits from MongoDB's distribution, backup, and recovery features. When using GridFS, we should reserve it for files that need it, index the metadata we query, and delete unused files regularly.