Hard Drive Investigation


Daily short news for you
  • These past few days, with the recent WWDC event, Apple has been the subject of online discussion about where its AI features actually stand. While other companies are racing to bring AI to their devices and software, Apple seems... not too concerned.

    Recently, Apple's researchers suggested that LLMs will "completely collapse in accuracy" when faced with extremely complex problems, arguing that their reasoning is merely an illusion. Rebuttals to the research appeared almost immediately. Once again, it raises the question of what Apple is really thinking about AI on its devices.

    I think it's quite simple: Apple seems to be struggling to build AI of its own. That is, it is running into difficulties right at the data collection stage for training. Apple has always positioned itself as respecting user privacy, so would it really scrape data from all over the web, or "steal" data from users' devices? And surely it wouldn't want to hand even more user data to third parties like OpenAI.

    However, perhaps these challenges will lead them to discover a new direction. If everyone chooses the easy path, who will share the hardships? 😁 Oh, I'm not an "Apple fan"; I just use whatever suits me 🤓.

    » Read more
  • A "sensitive" person to markdown is someone who jumps right in to see what's new when they come across a library that creates a new editor. Milkdown/milkdown is one example.

    At a glance it looks quite good. I might try integrating it into opennotas to see how it goes. opennotas is meant to be a note-taking application that supports Markdown, but tiptap, the editor library it uses, doesn't seem to want to add Markdown support 😩, and relying on an external library hasn't been satisfying so far.

    » Read more
  • Anyone using Cloudflare Workers to call OpenAI's API should be careful: I've been hitting the unsupported_country_region_territory error these past few days. Most likely the Worker is making the call from a region that OpenAI does not support.

    It's strange because this error has only occurred recently 🤔

    » Read more

The Issue

Hello readers, I sincerely apologize for not writing any new articles this past week. As I shared on Threads, I was quite busy with a project at work, and by the time I got home I only had enough energy left to catch up on a bit of news before going to bed and continuing the book I hadn't finished. That book is "The Fountainhead", the story of a young architect who gets expelled because he... has ideas ahead of his time. What could architects and programmers possibly have in common? And yet the story is so engaging that I can't put it down. So, as you can see, this is my redemption article.

Last week, I had to troubleshoot a strange error on a server. A colleague on the mobile app team reported that he couldn't call an API on the development (dev) server. After sitting down to debug together, we discovered that the same API worked fine with a small payload. But as soon as we sent a large amount of data, we immediately got a 500 error back, returned by nginx itself.

Then I checked the nginx logs and saw an error that looked like this:

[crit] 1530375#1530375: *11347 pwrite() "/var/lib/nginx/body/0000000020" failed (28: No space left on device), client: ****, server: ****, request: "POST **** HTTP/2.0", host: "****"

No doubt about it: the hard drive was full. That also explains why only large requests failed: when a request body exceeds its in-memory buffer, nginx spills it to a temporary file on disk (the /var/lib/nginx/body path in the log), and with no free space that write fails. Many might wonder why nobody knew the disk was full; this is an old server on GCP on which we couldn't install monitoring tools, so we had to wait for something to break to find out.

Immediately, I started tracking down what was causing the hard drive to be full, as this server didn't have much on it and served only for internal development purposes.

Investigation

The first thing I thought of was the log files, which over the years can "bulk up" very quickly and are often the main reason a disk fills up. However, after looking into the nginx log directory, I realized this wasn't the issue: those files were small and couldn't have filled the drive that fast. Hmm 🤔, time to investigate on a larger scale.
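
As a quick sanity check, something like this is enough to size up the log directory (a minimal sketch; the path assumes the default nginx log location on Ubuntu):

$ sudo du -sh /var/log/nginx/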

First, let's check the actual state of the hard drive with the df -h command, and it hit me like a wall when I saw the drive sitting at 100% below.

$ df -h  
Filesystem      Size  Used Avail Use% Mounted on  
/dev/root       9.6G  9.6G  0G  100% /  
devtmpfs        2.0G     0  2.0G   0% /dev  
tmpfs           2.0G     0  2.0G   0% /dev/shm  
tmpfs           394M   41M  354M  11% /run  
tmpfs           5.0M     0  5.0M   0% /run/lock  
...  

Indeed, the hard drive was full. So where had all the space gone? Let's move to the root directory and check with the du -sh */ command.

$ cd /  
$ du -sh */  
160M	bin/  
159M	boot/  
0	    dev/  
5.0M	etc/  
545M	home/  
...  

A bit chaotic, right? Let's sort the results in descending order for easier viewing.

$ du -sh */ | sort -hr  
4.8G	var/  
2.3G	snap/  
988M	lib/  
545M	home/  
529M	usr/  

So it's clear: the var directory alone occupies nearly 5GB of disk space, while the total capacity is under 10GB. Repeating the same steps while descending into the var directory, I found a directory named cache occupying almost half of that 4.8GB. After confirming it held only temporary data, I could safely clear it and free up about 2GB of space for my old server.
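
For reference, the drill-down is just the same commands one level down. Which cache it was isn't recorded here, so treat the cleanup line as an assumption: if the space turns out to be apt's package cache, apt-get clean is the safe way to reclaim it.

$ cd /var
$ du -sh */ | sort -hr
$ sudo du -sh cache/*/ | sort -hr    # drill into the cache directory
$ sudo apt-get clean                 # safe cleanup if it is apt's package cache (assumption)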

At this point, the problem was basically resolved. But looking back, I realized the process was quite manual and time-consuming. On a server we mostly work with commands, but think of an operating system like Windows, where you can browse through drives and folders to see what's there and delete unnecessary files. Is there a way to do that in a command-line environment?

Disk Analysis Tool

After some searching, I discovered a tool called ncdu, which analyzes disk usage and lets you browse through directories to see what each one holds. ncdu is open source and available for many operating systems, including Ubuntu.

$ sudo apt install ncdu  

Usage is straightforward: after installation, navigate to the root directory and run a command telling ncdu to scan the disk and write out its data.

$ cd /  
$ ncdu -o ~/report.ncdu  

Wait a moment for the report file to be generated, then open it to view the results.

$ ncdu -f ~/report.ncdu  

You'll immediately see a visual interface showing where the disk space has gone, and you can use the arrow keys to move through the directories. At this point it feels just like Windows, and we can easily see the state of our hard drive.

NCDU
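
As an aside (not part of the walkthrough above), ncdu can also scan and browse interactively in a single step; the -x flag keeps it on one filesystem so pseudo-filesystems and other mounts don't skew the numbers:

$ sudo ncdu -x /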

One more thing I personally find useful: for macOS users, if every time you check storage in Settings you feel your face go hot seeing "System Data" take up half of your drive without knowing what it is or what to delete, try ncdu and you might be in for some surprises 🫣

MacOS System Data
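
On macOS, a quick way to try this (assuming you use Homebrew; the directory to scan is only a suggestion, since much of what gets lumped into "System Data" tends to sit under ~/Library):

$ brew install ncdu
$ ncdu ~/Library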
