Edit

We recently stopped running our honeypots and consider the short-term project a success. We gathered a wealth of information and plan to publish a lengthy post that breaks it down and presents our final findings. Because of this, we have also removed our dashboard page - but the raw JSON data will remain on GitHub.

Introduction

Since starting HackYour.Tech, we have been working on a fun project in the background: a network of systems and stacks that helps us collect and process data from a range of sources. Without going too far off topic, one of these sources is honeypot data. We have deployed administrative services and exposed them to the public internet to attract malicious actors and bots in an attempt to capture authentication data.

While our collection of logs and data was building, we were only visualising it in Kibana on private dashboards - but the whole point of this blog is to bring value to our readers, so we put our heads together and came up with a public-facing dashboard that reports the gathered data too.

First Public Release

As of today, the first iteration of this dashboard has been added to this site as a single page. We have built the page directly into our Hugo theme, and it populates itself dynamically based on the user's choices. For those who are well versed in fancy dashboards, we want to state up front that this isn’t one of those. We are currently just using tables to present the most common values across four different fields.

These fields are:

  • Usernames
  • Passwords
  • IP addresses
  • Locations

For each field, we have pulled all data from our database and calculated how many times each one appears - ultimately presenting the most common values publicly. In terms of ranges of data, we have implemented a toggle that flips between ‘all time’ data and data relative to a selected week.
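
As a rough illustration of the counting and the 'all time' versus selected-week toggle described above, here is a minimal Python sketch. It is not our production code: the record layout, the field names (`username`, `password`, `ip`, `location`, `date`) and the `top_values` helper are all assumptions made for the example, and the sample records are made up.

```python
from collections import Counter
from datetime import date

# Made-up sample records; the field names are assumptions for this sketch.
ATTEMPTS = [
    {"username": "root", "password": "123456", "ip": "203.0.113.5",
     "location": "NL", "date": date(2021, 3, 1)},
    {"username": "root", "password": "password", "ip": "203.0.113.5",
     "location": "NL", "date": date(2021, 3, 2)},
    {"username": "admin", "password": "admin", "ip": "198.51.100.7",
     "location": "US", "date": date(2021, 3, 9)},
    {"username": "root", "password": "123456", "ip": "198.51.100.7",
     "location": "US", "date": date(2021, 3, 10)},
]


def top_values(attempts, field, limit=10, week=None):
    """Return the most common values of `field` as (value, count) pairs.

    `week` is an (ISO year, ISO week number) tuple; None means 'all time'.
    """
    if week is not None:
        # Keep only the attempts that fall inside the selected ISO week.
        attempts = [
            a for a in attempts
            if tuple(a["date"].isocalendar())[:2] == week
        ]
    return Counter(a[field] for a in attempts).most_common(limit)


if __name__ == "__main__":
    print(top_values(ATTEMPTS, "username"))                  # all time
    print(top_values(ATTEMPTS, "username", week=(2021, 9)))  # one week
```

Keying the weekly view on the ISO year and week number is just one way to define "a selected week"; a plain start and end date would work equally well.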

The raw data is also publicly available on our GitHub account, so please feel free to check it out. In addition, we have a backend service that processes the same data and generates wordlists of all the username and password values being collected. It’s worth noting that the rate at which new unique values appear has slowed down, which suggests these authentication attempts are likely all drawing on the same wordlists - however, we plan to do a deeper dive into this and try to narrow down the common lists they may be using. For those who are curious, our generated wordlists can be seen on this GitHub Repo.
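
To show what we mean by unique values slowing down, here is a small Python sketch of a wordlist that only grows when a value has not been seen before. Again, this is a simplified illustration rather than the actual backend service: the `update_wordlist` helper and the sample batches are hypothetical.

```python
def update_wordlist(wordlist, new_values):
    """Add unseen, non-empty values to `wordlist` (a set).

    Returns how many values were genuinely new, which is the number
    to watch when judging whether attackers keep reusing the same lists.
    """
    before = len(wordlist)
    wordlist.update(v for v in new_values if v)
    return len(wordlist) - before


if __name__ == "__main__":
    passwords = set()
    # Hypothetical batches of captured passwords, e.g. one batch per day.
    batches = [
        ["123456", "password", "admin"],
        ["123456", "admin", "letmein"],
        ["password", "123456", "admin"],
    ]
    for day, batch in enumerate(batches, start=1):
        added = update_wordlist(passwords, batch)
        print(f"day {day}: {added} new unique value(s)")

    # A sorted copy is what would be written out as the wordlist file.
    print(sorted(passwords))
```

If the count of new values per batch keeps shrinking while the number of attempts stays steady, that is consistent with attackers reusing a fixed set of lists, although it does not prove it.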

Future Plans

We plan to deploy more honeypots running different services and exposing different elements, to hopefully attract more humans and bots attempting to hack us, ultimately giving us more data to analyse and work with. The dashboard will likely adapt to this, allowing us to filter by specific service and hopefully present the data in different ways, including maps, trends and patterns.

Bear with us: we are waiting for enough data to work with, which will give us opportunities to build new versions and iterations. Fingers crossed we get some juicy data to play with!