Hello readers, I sincerely apologize for not writing any new articles this past week. As I shared on Threads, last week I was swamped with a project at work, and when I got home I only had time to catch up on the day's news before going to bed to continue listening to an unfinished audiobook. It's "The Fountainhead", the story of a young architect who was expelled because he... had ideas ahead of his time. Huh, what do architects and programmers have in common? Yet the story is so engaging that I couldn't put it down. And as you can see, this is my redemption article.
Last week, I had to troubleshoot a strange error on the server. A friend on the mobile app team reported that he couldn't call an API on the development (dev) server. After sitting down to debug together, we discovered that the same API worked when called with a small amount of data; when sending a large payload, it immediately returned a 500 error, served by nginx.
Then I checked the nginx logs and saw an error that looked like this:
[crit] 1530375#1530375: *11347 pwrite() "/var/lib/nginx/body/0000000020" failed (28: No space left on device), client: ****, server: ****, request: "POST **** HTTP/2.0", host: "****"
There was no doubt: the hard drive was full. Many might wonder how no one knew the disk was full; this is an old server on GCP on which we couldn't install monitoring tools, so we only found out when an error occurred.
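Without a monitoring agent, even a tiny cron-driven check would have caught this before anyone hit a 500. Here is a rough sketch of what such a check could look like (the 90% threshold, the script path in the comment, and plain echo alerting are my own assumptions, not part of the original setup):

```shell
#!/bin/sh
# Minimal disk-usage watchdog: warn when / crosses a threshold.
# Could run from cron, e.g.: */30 * * * * /usr/local/bin/disk-check.sh
THRESHOLD=90  # percent; pick whatever margin suits your server

# df --output=pcent prints a header line plus the usage percentage of /
USAGE=$(df / --output=pcent | tail -1 | tr -dc '0-9')

if [ "$USAGE" -ge "$THRESHOLD" ]; then
    # Replace this echo with whatever alerting channel you actually have
    echo "WARNING: / is at ${USAGE}% usage" >&2
fi
```

Swapping the echo for an email or chat webhook turns this into a poor man's monitoring system in a dozen lines.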
Immediately, I started tracking down what was causing the hard drive to be full, as this server didn't have much on it and served only for internal development purposes.
The first thing I thought of was the log files, which over the years can "bulk up" quickly and are a classic cause of a disk filling up. However, after entering the nginx logs directory, I realized this wasn't the issue: those files were small and couldn't have filled the drive that fast. Hmm 🤔 time to investigate on a larger scale.
First, let's check the actual state of the hard drive with the df -h command. What I saw hit me like a wall: the drive was at 100%.
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/root 9.6G 9.6G 0G 100% /
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 394M 41M 354M 11% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
...
Indeed, the hard drive was full. So where was all that space going? Let's go back to the root directory and check with the du -sh */ command.
$ cd /
$ du -sh */
160M bin/
159M boot/
0 dev/
5.0M etc/
545M home/
...
A bit chaotic, right? Let's sort the results in descending order for easier viewing.
$ du -sh */ | sort -hr
4.8G var/
2.3G snap/
988M lib/
545M home/
529M usr/
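Drilling down with du works directory by directory; if you suspect a few individual giant files instead (a forgotten dump, a runaway log), find can list them directly. A quick sketch — the 100M cutoff and the /var path are just examples, adjust to taste:

```shell
# List files larger than 100 MB under /var, staying on one filesystem (-xdev)
# and silencing permission-denied noise from directories we can't read
find /var -xdev -type f -size +100M -exec ls -lh {} + 2>/dev/null
```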
So it was clear: the var directory was occupying nearly 5GB of disk space, while the total capacity is less than 10GB. Repeating the same steps inside var, I found a directory named cache occupying almost half of that 4.8GB. After confirming it held only temporary data, I could safely delete it and free up 2GB of disk space on my old server.
At this point, everything was almost resolved. But looking back, I realized that my process was quite time-consuming and manual. In a server environment, we mainly use commands, but think about using an operating system like Windows, where you can browse through drives and directories to see what's there and delete unnecessary files. Is there a way to do that in a command-line environment?
After some searching, I discovered a tool called ncdu, which analyzes disk usage and lets you browse through directories to see all the data inside. ncdu is open-source and supports multiple operating systems, including Ubuntu.
$ sudo apt install ncdu
Usage is straightforward: after installation, navigate to the root directory and run a command for ncdu to scan the disk.
$ cd /
$ ncdu -o ~/report.ncdu
Wait a moment for the report file to be generated, then open it to view the results.
$ ncdu -f ~/report.ncdu
Immediately, you will see a visual interface showing where disk space is allocated; use the arrow keys to navigate through the directories. At this point it feels just like Windows, and we can easily see the state of our hard drive.
One more thing I personally find useful: for macOS users, if you've ever checked storage in Settings and felt your face flush at seeing "System Data" take up half your drive without knowing what it is or how to clear it, try ncdu and you might be in for some surprises 🫣