Introduction
When working with a large number of files or directories on the command line, you may encounter the “argument list too long” error. This error occurs when the number of arguments passed to a command exceeds the system’s allowed limit. In this article, we explain the cause of this error and how to address it.
Cause of the Error
When you pass arguments to a command on the command line, for example by writing `ls dir1`, `dir1` is the argument. If the total size of these arguments exceeds the system’s allowable limit, the “argument list too long” error occurs.
This limit can be checked with the command `getconf ARG_MAX` and is set to about 2 MB on many systems. Since this value is fixed when the system is compiled, it cannot be changed dynamically.
For example, running:

```
$ getconf ARG_MAX
2097152
```

shows 2,097,152 bytes (2 MB) on my system.
This problem often occurs when executing commands that use wildcards:

```
$ ls dir*
```
In the example above, the shell expands the wildcard, and the command `ls dir*` is replaced with something like:

```
$ ls dir1 dir2 dir3 ... dir1000000
```
If there are, say, 1,000,000 directories matching `dir*`, the expanded argument list easily exceeds the limit, leading to the “argument list too long” error.
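To get a feel for why the expansion fails, you can estimate the expanded size yourself. The sketch below is a rough back-of-the-envelope check; the zero-padded directory names are hypothetical, and the count ignores per-argument pointer overhead:

```shell
# ARG_MAX caps the combined byte size of a program's argument list
# (plus the environment).
arg_max=$(getconf ARG_MAX)

# Estimate the bytes one million names would occupy in argv:
# each name's length plus one terminating NUL byte.
size=$(seq -f 'dir%07g' 1 1000000 |
  awk '{ total += length($0) + 1 } END { print total }')

echo "limit: $arg_max bytes, expansion would need: $size bytes"
```

On a system with a 2 MB `ARG_MAX`, the roughly 11 MB estimate is several times over the limit, which is exactly the situation the shell reports as “argument list too long”.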
Solution 1: Using the xargs Command
When handling a large number of files or directories, the `xargs` command is very effective. `xargs` takes input from standard input and passes it as arguments to a specified command for execution. For example:
```
$ ls | xargs ls
```
The operation of `xargs` is as follows:

- It reads strings from standard input and appends them as arguments to the specified command (in this case, `ls`).
- It then executes the constructed command.
In the example above, it constructs a command like:

```
ls dir1 dir2 dir3 ...
```
At first glance, you might think that this would still result in an “argument list too long” error, but `xargs` is designed to build a command line only up to the system’s limit for command-line length. Once that limit is reached, `xargs` executes the command and then starts a new command with the remaining arguments. For example, it might execute:
```
ls dir1 dir2 ... dir100000
ls dir100001 dir100002 ... dir200000
...
```
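This splitting behavior is easy to observe at a small scale. Rather than waiting for the byte limit to be hit, the following minimal demo uses `-n 4` (an arbitrary cap) to force `xargs` to pass at most four arguments per invocation, producing the same run-then-continue pattern:

```shell
# Ten input items, at most four per echo invocation:
# xargs runs echo three times (1-4, 5-8, 9-10).
seq 1 10 | xargs -n 4 echo
```

This prints three lines: `1 2 3 4`, `5 6 7 8`, and `9 10`.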
You can check the command-line length limit that `xargs` uses by running it with the `--show-limits` option:

```
$ xargs --show-limits < /dev/null
...
Size of command buffer we are actually using: 131072
...
```
In the above output, the line `Size of command buffer we are actually using` shows approximately 130 KB (131,072 bytes) on my system. Although the system limit is 2 MB, this is the default buffer size when the `-s` option is not specified. If you use the `-s` option to request a 2 MB buffer, the value changes:
```
$ xargs --show-limits -s 2097152 < /dev/null
...
Size of command buffer we are actually using: 2097152
...
```
Furthermore, if you specify a value that exceeds the system limit (for example, 3 MB), `Size of command buffer we are actually using` will be corrected down to the system limit:

```
$ xargs --show-limits -s 3145728 < /dev/null
...
```
Let’s Actually Verify How xargs Works
Even if you understand the convenience of `xargs`, you might be hesitant to run it without testing. Here, we’ll create a test environment to see how it actually behaves.
Creating a Test Environment
Let’s create a test environment to verify the behavior of the `xargs` command.
In this example, we’ll simulate a case with 1000 directories, each containing 100 files. First, we’ll create the directories and files for the test case, setting up the environment in a directory called `test`.
```
$ mkdir test
$ cd test
$ mkdir dir{0001..1000}
$ ls | xargs -I{} touch {}/file{001..100}.txt
```
In the fourth line, we’re already using `xargs`, because in this case the expanded arguments would immediately cause an `argument list too long` error, so its use is unavoidable.
Here, the `-I` option is used when you want to embed the string passed from standard input at a specific location in the command. If you simply run something like `xargs ls`, the input is only appended to the end of the `ls` command. Using the `-I` option allows you to embed the input exactly where you need it. In this case, `{}` is replaced with `dir0001`, so in effect the command becomes:
```
touch dir0001/file001.txt dir0001/file002.txt ... dir0001/file100.txt
```
You can use any string in place of `{}`, like `aaa` or `bbb`, but by convention `{}` is typically used, and it’s recommended to stick with it to avoid confusion.
Triggering the `argument list too long` Error
Now, let’s check if the `argument list too long` error occurs in our test environment.

```
$ ls dir*/*.txt
-bash: /bin/ls: Argument list too long
```
Attempting to list all `*.txt` files results in an `argument list too long` error, confirming that the limit is indeed exceeded.
Listing the .txt Files One by One
Next, let’s list the `.txt` files in each directory one by one.

```
$ ls | xargs -I{} ls {} > result.txt
$ wc -l result.txt
100000 result.txt
```
After writing the results of the `ls` command to a file and counting the lines, you’ll see there are exactly 100,000 lines, one per file.
This confirms that we can process a large number of files one by one. However, you might want to preview the actual commands that will be executed beforehand. In such cases, you can use the `-p` option.
```
$ ls | xargs -p ls
ls dir0001 dir0002 dir0003 ... dir1000 ?...
```
When executed, it displays an extremely long command followed by `?...` and then waits for input. If you type `y` or `Y`, the displayed command is executed; if you simply press Enter, the command is skipped and the next one is shown.
Especially when executing commands that perform destructive actions (like deletion), it’s a good idea to use the `-p` option to verify that the arguments are as expected before proceeding.
Actually Deleting Files
If everything up to this point has been confirmed, deleting a large number of files becomes straightforward. Let’s try deleting `file001.txt` through `file080.txt` in each directory.
First, as a precaution, let’s verify that attempting to delete them all at once fails:

```
$ rm dir*/file0{01..80}.txt
-bash: /bin/rm: Argument list too long
```
As expected, an error occurs and the files are not deleted because there are too many arguments.
Now, let’s retrieve the directories one by one and use `xargs` to construct and execute the deletion command. For this test, we will delete them all at once without confirming via the `-p` option.

```
$ ls | xargs -I{} rm {}/file0{01..80}.txt
```
After deletion, verifying the contents shows that `file001.txt` through `file080.txt` have been properly removed.
Alternative Solution: Using the -exec Option of the find Command
Instead of using `xargs`, you can also resolve the “argument list too long” issue by using the `-exec` option with the `find` command. With the `-exec` option, `find` executes a command directly on each file it finds.
```
$ find . -name "*.txt" -exec ls {} \;
```

The above command works like this, running `ls` once for every file found, with `{}` replaced by that file’s path:

```
ls ./dir0001/file081.txt
ls ./dir0001/file082.txt
...
```
This method also avoids the “argument list too long” error, but since it executes the command for each file individually, it is significantly less efficient than `xargs`. However, in cases where processing files in bulk is not required, using `find` with the `-exec` option can be more concise than using `xargs`.
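As a side note, if you do want `find` to batch arguments the way `xargs` does, you can terminate `-exec` with `+` instead of `\;`; `find` then packs as many file names as fit into each invocation. A minimal sketch in a throwaway directory (the file names are arbitrary):

```shell
# Create a scratch directory with two .tmp files.
dir=$(mktemp -d)
touch "$dir/a.tmp" "$dir/b.tmp"

# -exec ... {} + groups many file names into a single rm call,
# similar to xargs, instead of one rm per file as {} \; would do.
find "$dir" -name "*.tmp" -exec rm {} +

ls -A "$dir"    # prints nothing: both files are gone
rmdir "$dir"
```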
For example, the combination of `find` and the `-exec` option is effective in the following scenarios.
1. Finding and Deleting Specific Files
If you need to search for and delete files with a specific extension, you can use the `-exec` option to execute the `rm` command directly on each file found.
```
$ find /path/to/directory -name "*.tmp" -exec rm {} \;
```
This command searches for all files with the `.tmp` extension within `/path/to/directory` and deletes each one. It’s a straightforward and readable solution for simple tasks, compared to using `xargs`.
2. Executing a Specific Command on Files
If you need to execute a command on the found files (for example, changing file permissions), you can use the `-exec` option to run the `chmod` command on each file.
```
$ find /path/to/directory -name "*.txt" -exec chmod 644 {} \;
```
3. Chaining Multiple Commands
Using the `-exec` option, you can also chain multiple commands to execute on each file. For instance, if you want to find specific files, display their contents, and then delete them, you can write:
```
$ find /path/to/directory -name "*.log" -exec cat {} \; -exec rm {} \;
```
This command searches for files with the `.log` extension, displays their contents (`cat`), and then deletes them (`rm`). This approach is useful when you need to perform multiple steps in a single command-line operation.
Summary
The `argument list too long` error is a common issue when handling a large number of files on the command line. However, by properly using `xargs` or the `-exec` option with `find`, you can work around this problem and streamline your tasks. For more detailed information and options, please refer to the respective command manuals.
- JM Project: find
- JM Project: xargs
Execution Environment