Linux (or Unix) command-line utilities such as awk, sort, and uniq can be used to analyze Apache logs and extract interesting statistics. One use case is finding the top IP addresses hitting your website. Here is a handy command for that, along with its output on sample data.
$ cat /var/log/apache2/access_log.2016-02-04 | awk '{print $1}' | sort | uniq -c | sort -rn | head
397 52.8.183.64
23 157.55.39.29
20 157.55.39.178
17 157.55.39.176
15 157.55.39.101
11 157.55.39.177
10 185.45.13.148
9 157.55.39.179
8 117.207.192.224
7 141.8.143.217
Note that the access log location shown here is for Apache installed on Ubuntu Linux; the path may differ on other systems.
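A few points to note:
- sort can handle a fairly large amount of data even on a low-RAM machine; implementations such as GNU sort spill to temporary files on disk when the input does not fit in memory.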
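- uniq -c outputs each unique entry prefixed with its count. It only collapses adjacent duplicate lines, so it works correctly only on sorted data.
- sort -rn does a reverse numeric sort, so the highest counts come first.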
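To see why the first sort matters, here is a small sketch that runs the same pipeline on a few made-up log lines instead of a real access log. The function names top_ips and sample_log are hypothetical helpers introduced only for this illustration, and the IP addresses are placeholders, not real traffic.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print the N most
# frequent values of the first field (the client IP). N defaults to 10.
top_ips() {
    awk '{print $1}' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Made-up sample data standing in for a real access log.
sample_log() {
    printf '%s\n' \
        '192.0.2.1 - - "GET / HTTP/1.1" 200' \
        '198.51.100.7 - - "GET /a HTTP/1.1" 200' \
        '192.0.2.1 - - "GET /b HTTP/1.1" 200' \
        '192.0.2.1 - - "GET /c HTTP/1.1" 404' \
        '198.51.100.7 - - "GET /d HTTP/1.1" 200' \
        '203.0.113.9 - - "GET /e HTTP/1.1" 200'
}

# With the sort in place, each IP appears once with its total count.
sample_log | top_ips 2

# Without the sort, uniq -c only counts runs of adjacent identical lines,
# so 192.0.2.1 shows up on two separate output lines.
sample_log | awk '{print $1}' | uniq -c
```

On a real log, the same function could be fed with something like `top_ips 10 < /var/log/apache2/access_log.2016-02-04`, which also avoids the extra cat process.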