Graphical Analysis

Website: https://boring-perlman-cf3505.netlify.com/

One of my side projects, this graphical analysis depicts a fictitious company that tracks, on a daily basis, which employees missed hours.

Scenario

Every day at 9:30am an email is sent out showing the employees who missed time the day before. The basic format of the email looks like this:

John Doe - 6/14 - 8 Hours
Jane Doe - 6/14 - 2.5 Hours

Equipped with this format, and an internal motivation to learn some new tech, I built a graphical analysis of this data.

Technology Stack

Python

Ever since I became interested in server penetration testing, I've wanted to learn Python. If I had to choose, I'd definitely be on the blue team.

Usage

I created a Python script to accomplish the following tasks:

  1. Authenticate with the Python Gmail API
  2. Scrape all missing time emails using a regex
  3. Write the data to a file to be used for the frontend of the site
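
Steps 2 and 3 can be sketched as follows. This is a minimal illustration, not the script itself: it skips Gmail authentication entirely, works on an in-memory email body, and uses a simplified pattern that only handles the basic format shown above. The names `LINE_PATTERN`, `extract_entries`, and `write_results`, as well as the output field names, are assumptions made for this example.

```python
import json
import re

# Simplified pattern (assumption): handles only "First Last - M/D - N Hours".
LINE_PATTERN = re.compile(
    r'(?P<name>[A-Za-z]+ [A-Za-z]+) - '
    r'(?P<date>\d+/\d+) - '
    r'(?P<hours>\d+(?:\.\d+)?) Hours?'
)


def extract_entries(body):
    """Return one dict per missed-time line found in an email body."""
    return [
        {
            'name': match.group('name'),
            'date': match.group('date'),
            'hours': float(match.group('hours')),
        }
        for match in LINE_PATTERN.finditer(body)
    ]


def write_results(entries, path):
    """Write the entries as JSON so the frontend can load them."""
    with open(path, 'w') as file:
        json.dump(entries, file)


# The two sample lines from the email format above
body = "John Doe - 6/14 - 8 Hours\nJane Doe - 6/14 - 2.5 Hours\n"
entries = extract_entries(body)
print(json.dumps(entries, indent=2))
```

Calling `write_results(entries, 'results.json')` would then produce the file the frontend reads.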

Code Sample

Here's a code sample of my Results class:

import json
from datetime import timedelta

from dateutil import parser as date_parser

from results.decorators import singleton
from results.employees import Employees
from results.utils import get_file

@singleton
class Results:
    def __init__(self):
        self.filenames = {
            'results': 'results.json',
        }
        self.collection = get_file(self.filenames['results'])
        self.last_updated = self.get_last_updated()

    def get_last_updated(self):
        if self.collection is None:
            return '2019-01-01'

        last_updated = self.collection[-1]['date']
        last_updated = date_parser.parse(last_updated)
        last_updated = last_updated + timedelta(days=1)
        last_updated = str(last_updated.date())

        return last_updated

    def add(self, results):
        employees = Employees()
        results = list(map(employees.normalize_results, results))

        new_results_added = len(results)

        # Extend the existing collection, or start one if no results file was loaded
        if self.collection is not None:
            self.collection = self.collection + results
        else:
            self.collection = results

        print('===\n')
        print("%i new result(s) added" % new_results_added)
        print("%i total results" % len(self.collection))
        print('\n===')

    def commit(self):
        with open(self.filenames['results'], 'w') as file:
            json.dump(self.collection, file)
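
The `singleton` decorator imported from `results.decorators` is not shown in the sample. As a rough idea of what such a decorator might look like (an assumption, not the project's actual implementation), here is one common way to write it, using a hypothetical `Counter` class to show the effect:

```python
def singleton(cls):
    """Replace a class with a factory that always returns the same instance."""
    instances = {}

    def get_instance(*args, **kwargs):
        # Build the instance on first use, then keep handing back the same one
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


@singleton
class Counter:
    """Hypothetical class used only to demonstrate the decorator."""

    def __init__(self):
        self.value = 0


first = Counter()
first.value += 1
second = Counter()

print(first is second)  # both names refer to one shared object
print(second.value)
```

With a decorator like this, every call to `Results()` would share the same loaded collection, so results added in one part of the script are visible when another part commits them.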

Regex

I learned a lot about non-capturing groups, named capture groups, and other useful regex tidbits. The regex101 website was an invaluable resource.

The regex for the Python script went through many iterations, but here is the final result:

import re

body_regex = re.compile(
    r'(?P<first_name>[a-zA-Z]+)[ \t.*](?P<last_name>[a-zA-Z]+)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<date_one>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<hours_one>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*]*)(?:(?:H|h)ours?)(?:(?:[ \t&]+)(?P<date_two>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<hours_two>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])(?:H|h)ours?)?', re.MULTILINE
)

Body Regex

So why did this regex get so complicated? The emails were written by hand, so a new format could show up in any given week and break the previous regex. In the end, I had to match all of the following:

FirstName LastName - 3/17 - 2 Hours & 3/18 - 8 Hours
FirstName LastName - 3/18 - 40 Hours
FirstName LastName - 3/18 - 40 Hour
FirstName LastName-3/18-40 Hours
FirstName LastName-3/18-40 Hour
FirstName LastName 3/18 - 8 Hours
FirstName LastName 3/18 8 Hours
FirstName LastName 3/18 1/2 Hours
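
To see the pattern at work, the sketch below runs the final regex against each of the formats listed above and prints the named groups. The regex is copied from the post; the `parse_line` and `to_hours` helpers are assumptions added for illustration (the project's own normalization code is not shown), with `to_hours` turning values such as `2.5` or `1/2` into a float.

```python
import re
from fractions import Fraction

body_regex = re.compile(
    r'(?P<first_name>[a-zA-Z]+)[ \t.*](?P<last_name>[a-zA-Z]+)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<date_one>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<hours_one>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*]*)(?:(?:H|h)ours?)(?:(?:[ \t&]+)(?P<date_two>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])?-?(?:[ \t.*]*)(?P<hours_two>[0-9]\d*(?:(?:\.|\/)\d+)?)(?:[ \t.*])(?:H|h)ours?)?', re.MULTILINE
)

SAMPLES = [
    'FirstName LastName - 3/17 - 2 Hours & 3/18 - 8 Hours',
    'FirstName LastName - 3/18 - 40 Hours',
    'FirstName LastName - 3/18 - 40 Hour',
    'FirstName LastName-3/18-40 Hours',
    'FirstName LastName-3/18-40 Hour',
    'FirstName LastName 3/18 - 8 Hours',
    'FirstName LastName 3/18 8 Hours',
    'FirstName LastName 3/18 1/2 Hours',
]


def parse_line(line):
    """Return the named groups for a line, or None if it does not match."""
    match = body_regex.search(line)
    return match.groupdict() if match else None


def to_hours(value):
    """Hypothetical helper: convert '8', '2.5', or '1/2' to a float."""
    return float(Fraction(value))


parsed = [parse_line(sample) for sample in SAMPLES]
for sample, fields in zip(SAMPLES, parsed):
    print(sample, '->', fields)
```

Only the first sample fills `date_two` and `hours_two`; for the others those groups come back as `None`, because the whole second date/hours section is optional.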

Nuxt.js

I chose Nuxt.js for fun. I hadn't used it before and wanted to learn a Vue.js framework. For state management, I used Vuex, which comes out of the box with Nuxt.js.

I relied heavily on higher-order functions to create complex relationships between hours, departments, and people.

For example, on the Email Send by Time Chart, here's a code sample of everything working together:

Vuex

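// store/misc.js

import dayjs from 'dayjs';
import advancedFormat from 'dayjs/plugin/advancedFormat';

// `Do` (ordinal day) in the format strings below requires the advancedFormat plugin
dayjs.extend(advancedFormat);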

export const getters = {
  /**
   * the top level getter
   * used by other getters to
   * create more specific data sets
   */
  analysis: (state, getters, rootState, rootGetters) => {
    let results = rootState.results.map((result) => {
      // Use 24-hour `HH` so afternoon send times are not re-parsed as morning times
      let expectedDate = dayjs(result.date).format('MM/DD/YYYY 9:30:00');
      let actualDate = dayjs(result.date).format('MM/DD/YYYY HH:mm:ss');

      return {
        senderName: rootGetters['person/get'](result.from).fullName,
        date: new Date(result.date),
        difference: dayjs(actualDate).diff(dayjs(expectedDate), 'minute'),
      };
    });

    let bestDifference = Math.min(...results.map((r) => r.difference));
    let worstDifference = Math.max(...results.map((r) => r.difference));

    // `filter` always returns an array, so no fallback is needed
    let bestDays = results.filter((r) => r.difference === bestDifference);
    let worstDays = results.filter((r) => r.difference === worstDifference);

    return {
      results,
      stats: {
        bestTime: dayjs(bestDays[0].date).format('hh:mm A'),
        worstTime: dayjs(worstDays[0].date).format('hh:mm A'),
        bestDifference,
        worstDifference,
        bestDays: bestDays.map((d) =>
          dayjs(d.date).format('(ddd) MMM Do, YYYY hh:mm A')
        ),
        worstDays: worstDays.map((d) =>
          dayjs(d.date).format('(ddd) MMM Do, YYYY hh:mm A')
        ),
      },
    };
  },
  /**
   * this getter is used for the actual graph
   * which uses the `analysis` getter from above to
   * transform the data into a format that google charts wants
   */
  emailSendByTime: (state, getters) => {
    return getters.analysis.results.map((result, index) => [
      new Date(result.date),
      result.difference,
      `
        <div class="p-2">
            <span style="white-space: nowrap;">
                <strong>Date:</strong> ${dayjs(result.date).format(
                  '(ddd) MMM Do, YYYY'
                )}
            </span><br />
            <span style="white-space: nowrap;">
                <strong>Time:</strong> ${dayjs(result.date).format('hh:mm A')}
            </span><br />
            <span style="white-space: nowrap;">
                <strong>Delay:</strong> ${result.difference} minutes
            </span>
        </div>
    `,
    ]);
  },
};
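
The core of the `analysis` getter is a small calculation: how many minutes after (or before) 9:30am each email went out, plus the best and worst days. Here is the same idea sketched in Python, to match the scraper script. The send times are made up for illustration, and `send_delay_minutes` is a hypothetical helper, not part of the project.

```python
from datetime import datetime


def send_delay_minutes(sent_at, hour=9, minute=30):
    """Whole minutes between the expected send time and the actual one."""
    expected = sent_at.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # int() truncates toward zero, so partial minutes are dropped
    return int((sent_at - expected).total_seconds() / 60)


# Hypothetical send times, for illustration only
sent_times = [
    datetime(2019, 6, 14, 9, 28, 0),
    datetime(2019, 6, 17, 9, 47, 30),
    datetime(2019, 6, 18, 13, 45, 0),
]

results = [
    {'date': sent, 'difference': send_delay_minutes(sent)}
    for sent in sent_times
]

differences = [r['difference'] for r in results]
best_difference = min(differences)
worst_difference = max(differences)

# More than one day can share the best or worst delay, hence the lists
best_days = [r for r in results if r['difference'] == best_difference]
worst_days = [r for r in results if r['difference'] == worst_difference]

print(differences)
print(best_difference, worst_difference)
```

A negative difference means the email went out before 9:30am, which is why the "best" day is the minimum rather than the value closest to zero.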

Template

<GChart v-bind="attrs" />

Script

// components/misc/ChartEmailSendByTime.vue

import { mapGetters } from 'vuex';

export default {
  computed: {
    ...mapGetters({
      rows: 'misc/emailSendByTime',
    }),
    chartData() {
      let headers = [
        { type: 'date', label: 'Date' },
        { type: 'number', label: 'Send Delay (minutes)' },
        { type: 'string', role: 'tooltip', p: { html: true } },
      ];

      return [headers, ...this.rows];
    },
    attrs() {
      return {
        type: 'LineChart',
        settings: { packages: ['line', 'corechart'] },
        data: this.chartData,
        options: {
          toggleSidebar: this.toggleSidebar,
          title: 'Email Send By Time',
          chartArea: {
            top: 70,
            right: 25,
            bottom: 100,
            left: 100,
          },
          height: 600,
          width: '100%',
          tooltip: { isHtml: true },
          legend: { position: 'top' },
          hAxis: {
            title: 'Date',
            format: 'M/d/yy',
            slantedText: true,
            slantedTextAngle: 45,
          },
          vAxis: {
            title: 'Send Delay (minutes)',
          },
          series: {
            0: { color: '#27c30d' },
          },
          trendlines: {
            0: {
              color: '#444444',
              labelInLegend: 'Trendline',
              visibleInLegend: true,
              tooltip: false,
            },
          },
          crosshair: { trigger: 'both', orientation: 'both' },
          lineWidth: 3,
        },
      };
    },
  },
};

Google Charts

I used the Vue Google Charts plugin to get a good starting point for setup and reactivity.

It was a lot of fun researching and picking out charts to use in the app. I went mostly with line and bar charts, since they fit the data best. My favorite graph was the Sankey Diagram, which I used on the People Flow Chart.

People Flow Chart