The organisational implications of using employee messaging for data science


Who is doing this?

A number of companies are already looking at email for valuable content. One of the first email datasets used by data scientists is a corpus of emails from Enron Corporation, made public after it was obtained by the Federal Energy Regulatory Commission during its investigation. It is now a rich training dataset for data scientists looking to understand the schema and value of email data.

Advice for companies considering accessing email data

Regardless of whether you will be working with your own data or allowing a third-party vendor to access it, it is imperative that you communicate clearly with employees about the nature and value of any analysis of their emails. The UK Information Commissioner’s Office emphasises that employees have an expectation of privacy at work even when they have been informed that workplace monitoring may take place. Respect for employees should be the paramount concern, taking priority even over company profitability.

Analysing your own data

If you are seeking to unlock the value of your own data, ensure that your employee policies are clear and up to date. Where no third party is involved, your use is most likely covered by your right to monitor emails; however, you should ensure that employees are informed of any monitoring before it takes place. ACAS has clear and simple guidelines for communicating with employees:

  • Monitoring shouldn’t be excessive and should be justified.
  • Staff should be told what information will be recorded and how long it will be kept.
  • If employers monitor workers by collecting or using information, the Data Protection Act will apply.
  • Information collected through monitoring should be kept secure.

Allowing access to a third party

If you are considering working with a third-party vendor, things become considerably more complex. Email data is commercially sensitive, personally identifiable and private to the employee. Before allowing third parties any access to emails, CTOs and data protection officers should give considerable thought to the real value being offered by the service.

Advice for companies considering developing sensitive message analytics products

For most vendors of analytics services, the first challenge will be dealing with compliance and regulatory concerns on behalf of the client. This will often be a greater challenge than the actual analysis of the corpus. As my conversations with other CTOs demonstrated, it is still easier for a client to refuse access than to deal with the thorny problem of justifying access to email data.

Tips for potential vendors

If you are considering offering a service to analyse corporate messaging, here’s a summary of tips that you may find useful, especially when dealing with EU or global clients bound by GDPR.

  • Wherever possible, ensure that processing is one-time rather than continuous.
  • Wherever possible, ensure that processing takes place within the perimeter of the client organisation. Avoid moving data away from company systems (this may require a more consultative, rather than service-based, approach).
  • If possible, provide analytics tools to the organisation for its own use. Receive only the analysed output from the tools, rather than the raw data.
  • Develop a chain of custody for client data.
  • If machine learning is used for analysis, ensure that a way to demonstrate the decisions made in the analysis is available to the client. This is especially important for non-rule-based approaches (like those in neural networks).
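
To make a couple of these tips concrete, here is a minimal Python sketch of what "receive only the analysed output" and "chain of custody" might look like in practice. It is an illustration under assumptions rather than a recommended implementation: the function names (pseudonymise, summarise, custody_entry), the message fields ("sender", "body") and the log layout are all hypothetical, and a real deployment would need legal review and proper key management. The idea is that the tool runs inside the client's perimeter, replaces email addresses with keyed hashes, hands back counts only, and records each handover in a hash-linked log.

```python
import hashlib
import hmac
import json
from collections import Counter
from datetime import datetime, timezone
from typing import Mapping, Sequence


def pseudonymise(address: str, secret: bytes) -> str:
    """Replace an email address with a keyed hash (HMAC-SHA256).

    The secret is assumed to stay with the client, so the vendor
    cannot map the identifier back to a person.
    """
    digest = hmac.new(secret, address.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def summarise(messages: Sequence[Mapping[str, str]], secret: bytes) -> dict:
    """Run inside the client perimeter and return aggregate counts only.

    Message bodies are never copied into the result.
    """
    per_sender = Counter(pseudonymise(m["sender"], secret) for m in messages)
    return {"message_count": len(messages), "per_sender": dict(per_sender)}


def custody_entry(previous_hash: str, actor: str, action: str, payload: dict) -> dict:
    """Build one log entry whose hash covers the previous entry's hash.

    Changing an earlier entry would change every hash after it, which is
    what lets the log be checked for tampering.
    """
    entry = {
        "previous_hash": previous_hash,
        "actor": actor,
        "action": action,
        "payload_sha256": hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return entry


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"sender": "alice@example.com", "body": "Quarterly numbers attached."},
        {"sender": "bob@example.com", "body": "Thanks, will review."},
        {"sender": "alice@example.com", "body": "Any update?"},
    ]
    summary = summarise(sample, secret=b"client-held-secret")
    first = custody_entry("", "client-tool", "summary produced", summary)
    second = custody_entry(first["entry_hash"], "vendor", "summary received", summary)
    print(json.dumps(summary, indent=2))
    print(second["previous_hash"] == first["entry_hash"])
```

Only the summary dictionary and the custody log would leave the client's systems in this sketch; the raw messages and the hashing secret stay where they are. Note that keyed hashing is pseudonymisation, not anonymisation, so the output may still count as personal data.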

Technologist, lean evangelist, chaos monkey and Chief Technology Prevention Officer. Loves good coffee, hanging around on ropes and driving about in cars
