Extracting Tarballs To A Custom Path

Extracting tarballs is something that most Unix/Linux admins do on autopilot. We work like that with most of our CLI tools because we’re more efficient when we think about what we’re doing rather than how to make the tool do it. But when we want a tool to do something that isn’t habit, it feels a bit like what Windows users feel when they try to close an app on OS X by moving the mouse to the top-right corner of the window, only to realize the close button isn’t there.

This is how I feel every time I want to have tar extract an archive to a custom path. I’m generally doing this as part of a script, and I don’t always know the name of the archive, so I can’t infer the name of the directory it extracts to. That makes searching Google for an answer frustrating, because most people suggest the -C flag, which is close, but not what I want. Let’s go through some examples.

First, let’s create an example archive to work with:

cd ~
mkdir test-42.0.1
touch test-42.0.1/somefile.txt
tar -cf test-latest.tar test-42.0.1

This is more or less how tarballs you’ll get from teh intarwebz are created. Usually the top-level directory will match the name of the .tar file, but it certainly doesn’t have to, which means we can’t assume it always will. When we extract one via a script, we probably want to do something with its contents, which means we need to know the extraction location. Let’s try the -C flag that Google results seem to love so much.

# tar's -C won't create the directory; we have to make it.
mkdir /tmp/work-in-progress
tar -xf test-latest.tar -C /tmp/work-in-progress
ls /tmp/work-in-progress/
test-42.0.1/

That’s close, but we still have to figure out the name of the top-level directory inside our specified -C path. That’s not too hard, but I like writing as little code as possible. Let’s see if we can just make tar behave slightly better for our purposes.
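For comparison, here is a rough sketch of what figuring out the top-level name by hand might look like. It assumes the archive has exactly one top-level directory, and the temp directory and the variable names (work, topdir) are made up for illustration.

```shell
# Build a throwaway archive shaped like the example one (paths are illustrative).
work=$(mktemp -d)
mkdir "$work/test-42.0.1"
touch "$work/test-42.0.1/somefile.txt"
tar -cf "$work/test-latest.tar" -C "$work" test-42.0.1

# List the archive members and keep the first path component of the first entry.
# Assumption: everything in the archive lives under a single top-level directory.
topdir=$(tar -tf "$work/test-latest.tar" | head -n 1 | cut -d/ -f1)

# Extract with -C, then use the discovered name to reach the contents.
mkdir "$work/out"
tar -xf "$work/test-latest.tar" -C "$work/out"
ls "$work/out/$topdir"
```

It works, but it is an extra pipeline and an extra variable just to find out where the files ended up.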

rm -r /tmp/work-in-progress/*
tar -xf test-latest.tar -C /tmp/work-in-progress --strip-components 1
ls /tmp/work-in-progress/
somefile.txt

And that is precisely what I was looking for. --strip-components 1 simply removes the first path component from each path tar extracts. In our case, it removed the top-level directory test-42.0.1 and left us with the contents, which is all we were really after. Hopefully this helps others who, like me, always seem to go googling for something more than -C before reading the man page!
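If you want this in a script, a minimal sketch might look like the following. The function name extract_to and the temp paths are hypothetical, not anything tar provides. It assumes a tar that supports --strip-components (GNU tar and bsdtar both do, but it isn’t in POSIX) and an archive with a single top-level directory.

```shell
# extract_to ARCHIVE DEST
# Hypothetical helper: unpack ARCHIVE into DEST, dropping the top-level directory.
extract_to() {
    archive=$1
    dest=$2
    # tar's -C won't create the directory, so make it first.
    mkdir -p "$dest" || return 1
    tar -xf "$archive" -C "$dest" --strip-components 1
}

# Try it on a throwaway archive shaped like the example one.
work=$(mktemp -d)
mkdir "$work/test-42.0.1"
touch "$work/test-42.0.1/somefile.txt"
tar -cf "$work/test-latest.tar" -C "$work" test-42.0.1

extract_to "$work/test-latest.tar" "$work/work-in-progress"
ls "$work/work-in-progress"
```

One thing to keep in mind: any archive member with no more than one path component gets skipped entirely, so this only makes sense for tarballs that really do wrap everything in a top-level directory.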
