Homework Assignment #2 – Using strace to perform forensics on the Apache web server [50 pts]


You are expected to do your own work on all homework assignments. You may (and are encouraged to) engage in general discussions with your classmates regarding the assignments, but specific details of a solution, including the solution itself, must always be your own work. (See the statement on Academic Dishonesty in the Syllabus.)

How to turn in?

Assignments need to be turned in via Laulima. Check the Syllabus for the late assignment policy for the course.

What to turn in?

You should turn in a single plain text file named README.txt with your answers to the assignment’s questions. NO PDF SUBMISSION ALLOWED.


Environment

For this assignment you only need to consult man pages. You can do so in your own Linux environment (see Assignment #0) using the man command. Linux man pages are also available on-line.


Overall objective

The pedagogic objective of this assignment is twofold:

  1. Understand strace output and be exposed to well-known syscalls

  2. Increase your familiarity with the Linux command-line. You can do everything in this assignment using these powerful and commonplace commands (help with these commands is available on-line, in man pages, and of course from your instructor and TA):

    • cat
    • grep
    • cut
    • wc
    • sort
    • sed
    • tail
    • uniq

Exercise #1: Apache web server forensics [50 pts]

A popular web server implementation is provided by the Apache HTTP Server Project. Your company had a machine that ran this web server, but eventually that machine got compromised and was permanently retired. You are tasked with performing some forensics, but unfortunately all log files have been lost. The only thing that remains is an strace output that the system administrator collected for about 6 minutes (the observation period) and saved before the machine was retired.

The strace output was collected from the server using the following command:

strace -f -x -o /tmp/apache2.strace -v -s 1024 -T -ttt apache2

You can use the strace -h command on Linux (or look up the strace man page) to see the list of all command-line options to strace, so as to fully understand what the above command did when producing this 14,000-line plain text output.
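
To make the effect of these options concrete, here is a minimal sketch that splits one trace line into its fields using cut and sed. The sample line is made up for illustration and is not taken from the actual trace. The layout it assumes (PID first, then the -ttt timestamp, then the call, then the -T duration in angle brackets) should be checked against the strace man page and the real file.

```shell
# Hypothetical sample line, NOT from the real trace.
line='1234 1600000000.123456 read(5, "GET / HTTP/1.1\r\n", 8000) = 16 <0.000021>'

# Field 1: PID of the process that made the call (present because of -f with -o).
pid=$(printf '%s\n' "$line" | cut -d' ' -f1)

# Field 2: seconds since the epoch, with microseconds (from -ttt).
timestamp=$(printf '%s\n' "$line" | cut -d' ' -f2)

# Field 3 up to the first "(": the syscall name.
syscall=$(printf '%s\n' "$line" | cut -d' ' -f3 | cut -d'(' -f1)

# Trailing <...>: time spent inside the syscall, in seconds (from -T).
duration=$(printf '%s\n' "$line" | sed 's/.*<\([0-9.]*\)>$/\1/')

echo "pid=$pid timestamp=$timestamp syscall=$syscall duration=$duration"
```

Lines that strace marks as unfinished or resumed do not follow this exact layout, so the sketch would need adjusting before being applied to the whole file.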

Based on the above output, answer the questions hereafter.

Question #1: Counting syscalls [10 pts]

        101 pineapple
        230 orange
        356 banana
       1231 guava
       2902 mango
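
A listing shaped like the one above (a count followed by a name, ordered by increasing count) is the kind of output a sort, uniq -c, sort -n pipeline prints. The following is a minimal sketch on made-up input; the function name count_items is hypothetical and the fruit names are placeholders, not syscalls.

```shell
# Hypothetical helper: read one item per line on stdin and print
# "count item" pairs, least frequent first.
count_items() {
  # uniq -c only merges adjacent duplicates, so sort first;
  # the final sort -n orders the result by the numeric count.
  sort | uniq -c | sort -n
}

# Made-up input for illustration.
printf '%s\n' orange banana orange banana banana | count_items
```

The amount of padding in front of each count depends on the uniq implementation, so any script that parses this output should not rely on a fixed column width.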

Question #2: Processes [10 pts]

Each line of the strace output starts with a PID (Process ID), which is a unique number associated with the process that invoked the system call on that line. The web server uses multiple processes during its execution. (We will talk more about PIDs later this semester.)
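
Because the PID is the first space-separated field of every line, it can be isolated with cut. The sketch below uses three made-up trace lines (not from the real trace) and a hypothetical helper named distinct_pids.

```shell
# Hypothetical helper: read trace lines on stdin, print each PID once.
distinct_pids() {
  # Keep only the first space-separated field, then drop duplicates.
  cut -d' ' -f1 | sort -u
}

# Made-up sample lines for illustration.
sample='101 1600000000.000001 getpid() = 101 <0.000005>
102 1600000000.000050 close(7) = 0 <0.000004>
101 1600000000.000090 close(8) = 0 <0.000004>'

# List the distinct PIDs, then count them.
printf '%s\n' "$sample" | distinct_pids
printf '%s\n' "$sample" | distinct_pids | wc -l
```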

Question #3: PNG Headers [10 pts]

The web server responds to many HTTP GET requests for downloading image files in the PNG format. There is suspicion that the AWS-Logo-for-dark-150x150-1.png image file served by the web server was corrupted.
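
For background, every valid PNG file begins with the same 8-byte signature: 89 50 4E 47 0D 0A 1A 0A in hexadecimal. The sketch below checks whether a buffer starts with that signature. It assumes the buffer appears in the trace as a string of \xNN escapes with lowercase hex digits, which is what the -x option is expected to produce for binary data; verify this against the actual file. The function name and both sample buffers are hypothetical.

```shell
# The 8-byte PNG signature, written as literal \xNN escapes
# (single quotes keep the backslashes as-is).
PNG_SIGNATURE='\x89\x50\x4e\x47\x0d\x0a\x1a\x0a'

# Hypothetical helper: succeed if the escaped string in $1
# begins with the PNG signature.
starts_with_png_signature() {
  case "$1" in
    "$PNG_SIGNATURE"*) return 0 ;;  # quoted, so matched literally
    *) return 1 ;;
  esac
}

# Made-up buffers: the second has its last signature byte altered.
good='\x89\x50\x4e\x47\x0d\x0a\x1a\x0a\x00\x00\x00\x0d'
bad='\x89\x50\x4e\x47\x0d\x0a\x1a\x0b\x00\x00\x00\x0d'

for buf in "$good" "$bad"; do
  if starts_with_png_signature "$buf"; then
    echo "signature ok"
  else
    echo "signature mismatch"
  fi
done
```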

Question #4: Connected Clients [10 pts]

The web server answers requests sent by clients (e.g., web browsers) that run on various machines during the observation period. The main thing that a web server does is wait for a connection and then “accept” the connection to handle whatever request was sent. The name of the syscall to accept a connection is very intuitive.
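
When strace decodes the socket address filled in for an accepted connection, an IPv4 client typically shows up as sin_addr=inet_addr("..."). The sketch below pulls those addresses out and counts connections per client. The sample lines are made up (using documentation-only addresses), the helper name client_addresses is hypothetical, and the exact line format, including whether the call is named accept or accept4, is an assumption to verify against the real trace.

```shell
# Hypothetical helper: read trace lines on stdin and print
# "count address" for each IPv4 client seen in accept/accept4 calls.
client_addresses() {
  # Keep complete calls and the "resumed" half of calls that strace
  # split in two, then extract the quoted IPv4 address, if any.
  grep -E '(accept4?\(|<\.\.\. accept4? resumed>)' \
    | sed -n 's/.*inet_addr("\([0-9.]*\)").*/\1/p' \
    | sort | uniq -c | sort -n
}

# Made-up sample lines for illustration.
sample='201 1600000001.000100 accept4(4, {sa_family=AF_INET, sin_port=htons(50000), sin_addr=inet_addr("192.0.2.10")}, [128->16], SOCK_CLOEXEC) = 11 <0.000030>
202 1600000001.500000 accept4(4, <unfinished ...>
202 1600000002.000200 <... accept4 resumed>{sa_family=AF_INET, sin_port=htons(50001), sin_addr=inet_addr("192.0.2.20")}, [128->16], SOCK_CLOEXEC) = 11 <0.500200>
201 1600000003.000300 accept4(4, {sa_family=AF_INET, sin_port=htons(50002), sin_addr=inet_addr("192.0.2.10")}, [128->16], SOCK_CLOEXEC) = 11 <0.000025>'

printf '%s\n' "$sample" | client_addresses
```

The unfinished half of a split call carries no address, so the sed step silently drops it. IPv6 clients are decoded differently and are not handled by this sketch.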

Question #5: The 935.json File [10 pts]

Your boss, for some reason, is particularly interested in the file 935.json.