Extracting tarballs is something that most unix/linux admins do on autopilot. We work that way with most of our CLI tools because we’re more efficient when we can think about what we’re doing rather than how to make the tools do it. But when we want a tool to do something that isn’t habit, it feels a bit like what Windows users go through when they try to close an app on OS X by moving the mouse to the top-right corner of the window, only to realize the close button isn’t there.
This is how I feel every time I want to have tar extract an archive to a custom path. I’m generally doing this as part of a script, and I don’t always know the name of the archive, so I can’t infer the name of the directory once it’s extracted. This makes searching Google for a fix frustrating, because most people suggest using the -C flag, which is close but not what I want. Let’s go through some examples.
First, let’s create an example archive to work with:
cd ~
mkdir test-42.0.1
touch test-42.0.1/somefile.txt
tar -cf test-latest.tar test-42.0.1
This is more or less how tarballs you’ll get from teh intarwebz are created. Usually the top-level directory will match the name of the .tar file, but it doesn’t have to, which means we can’t assume it always will. When we’re extracting an archive in a script, we probably want to do something with its contents, which means we need to know the extraction location. Let’s try the -C flag that Google results seem to love so much.
# tar's -C won't create the directory; we have to make it.
mkdir /tmp/work-in-progress
tar -xf test-latest.tar -C /tmp/work-in-progress
ls /tmp/work-in-progress/
test-42.0.1/
That’s close, but we still have to figure out the name of the top-level directory within our specified -C path. That’s not too hard, but I like writing as little code as possible. Let’s see if we can just make tar behave slightly better for our purposes.
rm -r /tmp/work-in-progress/*
tar -xf test-latest.tar -C /tmp/work-in-progress --strip-components 1
ls /tmp/work-in-progress/
somefile.txt
And that is precisely what I was looking for.
--strip-components 1 simply removes the first path component from each path it extracts. In our case, it removed the top-level directory test-42.0.1 and left us with the contents, which is all we were really after. Hopefully this helps others who are googling around for something more than -C before reading the man page, like I always seem to do!
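Since I usually need this inside a script, here is a rough sketch of how the pieces might fit together. The function name extract_flat and the scratch paths are my own inventions for illustration, not anything tar provides. Because --strip-components 1 drops the first component of every path, the sketch only strips when the archive listing shows exactly one top-level entry, and otherwise falls back to a plain -C extraction.

```shell
# extract_flat is a hypothetical helper name, not part of tar.
# Usage: extract_flat ARCHIVE DEST
extract_flat() {
    archive=$1
    dest=$2

    # tar's -C won't create the directory, so make it first.
    mkdir -p "$dest" || return 1

    # Count the distinct first path components in the archive listing.
    # tr strips the padding some wc implementations add around the number.
    top_count=$(tar -tf "$archive" | cut -d/ -f1 | sort -u | wc -l | tr -d '[:space:]')

    if [ "$top_count" -eq 1 ]; then
        # One top-level entry: drop it so the contents land directly in DEST.
        tar -xf "$archive" -C "$dest" --strip-components 1
    else
        # Several top-level entries: leave the paths alone.
        tar -xf "$archive" -C "$dest"
    fi
}

# Demo in a throwaway directory, mirroring the example archive from above.
scratch=$(mktemp -d)
mkdir "$scratch/test-42.0.1"
touch "$scratch/test-42.0.1/somefile.txt"
tar -cf "$scratch/test-latest.tar" -C "$scratch" test-42.0.1

extract_flat "$scratch/test-latest.tar" "$scratch/work-in-progress"
ls "$scratch/work-in-progress"
```

This is only a sketch under the assumption that the archive looks like the one in this post. It doesn’t try to handle an archive that holds a single bare file (stripping one component would leave nothing to extract) or one whose paths all start with ./, so check the listing from tar -tf before trusting it on archives you didn’t build yourself.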