Monday, June 24, 2024

Applying ML in Cybersecurity – TechBullion


Trends and Insights from Aleksandr Timashov, a Leading ML Expert

Meet Aleksandr, an ML Engineer with over a decade of experience in AI and Machine Learning. A graduate of Lomonosov Moscow State University and Stanford University, Aleksandr has applied his expertise in various industries including e-commerce, oil & gas, and fintech. Currently, he engineers malware detection systems at Meta. With a background in AI, Deep Learning, and NLP, Aleksandr agreed to share his insights into present-day cybersecurity challenges. His motto is “Training AI models to make this world better!”

First of all, could you please tell our readers about yourself? You have worked across a wide range of industries, each presenting unique challenges. What has driven your passion for applying machine learning to diverse fields, and how has this experience influenced your current field of work?

Thanks, I am excited to share my journey! Frankly, though, my story may sound similar to those of many of your readers. It all began with a childhood love for mathematics that, over the years, grew into practical applications of machine learning across a spectrum of global industries. From my school days, I was drawn to numbers and patterns, and this passion ultimately determined my path in life. After I became a Software Engineer, my early interest in linear algebra, probability theory, and programming proved crucial in helping me tackle the diverse challenges I encountered in my career.

My professional path took me from Russia to Malaysia to the UK and spanned sectors such as finance, e-commerce, oil & gas, cybersecurity, and now integrity. Each experience has helped me hone my technical skills and deepened my understanding of different market dynamics and cultural approaches.

In finance, I developed scoring algorithms that enhanced decision-making and reduced risks. In e-commerce, I crafted algorithms capable of comparing millions of items in real time, significantly improving product discovery and enhancing customer experiences. In the oil & gas sector, I led a computer vision team that fortified security and automated inspections, boosting efficiency and achieving considerable cost savings. Currently, in the integrity field, I leverage my machine learning knowledge to ensure compliance and uphold ethical standards. My broad experience helps me promote transparent and reliable processes across industries.

One consistent thread throughout my career has been the tangible impact of my work. Whether optimizing operations, enhancing user experiences, or ensuring safety, I have always drawn deep satisfaction from seeing the positive outcomes of my efforts. I am enthusiastic about continuing to use my skills at Meta to foster integrity and transparency in business practices, striving for a more ethical and, at the same time, more efficient society.

We know you probably cannot disclose much about your current work at Meta. Still, without violating your NDA, could you share with us the most intriguing technical challenges you’ve personally encountered while building malware detection systems?

You know, perhaps the most intriguing aspect of working at Meta is the sheer scale at which we operate. I am genuinely excited by the challenge of applying ML algorithms to protect hundreds of millions of users from malware. The company deals with vast amounts of data and a variety of constantly evolving threats, which requires us to continuously advance our algorithms to detect and mitigate potential risks. We have to make sure our systems are both efficient and scalable while maintaining user privacy. This not only helps safeguard user data but also keeps users from being exposed to harmful content.

For me, it’s incredibly rewarding to see the direct impact of my work on such a large scale and to see myself contributing to a safer global online environment. And of course, Meta provides me with a plethora of opportunities to further develop my skills, as I deal every day with the most cutting-edge ML technologies, including such fascinating ones as Semi- and Self-Supervised Learning, and GNNs.

You have mentioned Graph Neural Networks, and I know that in the last five years, this tech has been all the hype in cybersecurity. Could you tell our readers about GNNs and how they can be leveraged for malware detection?

Certainly, it’s actually one of my favorite topics! Just in case some of your readers don’t use GNNs yet and need some general information on them: Graph Neural Networks are a type of neural network designed to capture dependencies within graph-structured data. They model complex relationships and interactions between entities, which is crucial for analyzing topological patterns in such data.
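To make the idea of "capturing dependencies" concrete, here is a minimal sketch of a single message-passing step, the basic building block of most GNN layers: each node updates its feature vector by mixing it with the average of its neighbors' vectors. The toy graph, the feature values, and the scalar weights `w_self` and `w_neigh` are illustrative assumptions; a real GNN layer would use learned weight matrices.

```python
# Minimal sketch of one message-passing step on a toy graph.
# All values below are made up for illustration.

def mean_aggregate(adjacency, features):
    """For each node, average the feature vectors of its neighbors."""
    aggregated = {}
    for node, neighbors in adjacency.items():
        dim = len(features[node])
        if not neighbors:
            # Isolated node: no messages to aggregate.
            aggregated[node] = [0.0] * dim
            continue
        aggregated[node] = [
            sum(features[n][i] for n in neighbors) / len(neighbors)
            for i in range(dim)
        ]
    return aggregated


def message_passing_step(adjacency, features, w_self, w_neigh):
    """Compute h_v' = ReLU(w_self * h_v + w_neigh * mean of neighbor h_u).

    Scalar weights stand in for the learned matrices of a real GNN layer.
    """
    neighbor_mean = mean_aggregate(adjacency, features)
    updated = {}
    for node in adjacency:
        updated[node] = [
            max(0.0, w_self * h + w_neigh * m)
            for h, m in zip(features[node], neighbor_mean[node])
        ]
    return updated


# Toy undirected graph: a - b - c
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

hidden = message_passing_step(adjacency, features, w_self=0.5, w_neigh=0.5)
```

After one step, each node's vector already reflects its immediate neighborhood; stacking several such steps lets information travel across longer paths in the graph.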

Specifically, in cybersecurity, GNNs find various applications. They are a great tool for detecting unusual patterns within complex structures like function call graphs, patterns that can signal malicious activity. Beyond that, GNNs play a major role in uncovering complex attack patterns in network traffic. You can model network traffic as a graph where nodes are users and/or devices and edges are interactions or data flows, and GNNs can help you catch anomalies that suggest cyber threats like DDoS attacks. GNNs are also used in social network analysis to identify coordinated misinformation campaigns or malicious social bots – although this doesn’t quite fall within the scope of my work, so I’m not going to go into detail on this topic.
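The traffic-as-a-graph modeling described above can be sketched as follows. This covers only the graph construction step, plus a crude hand-written fan-in heuristic as a stand-in for the anomaly signal; it is not a GNN, and in a GNN pipeline the resulting nodes and edges would serve as model input instead. The flow records, host names, and the `min_sources` threshold are hypothetical.

```python
from collections import defaultdict

# Hypothetical flow records: (source device, destination device, bytes sent).
flows = [
    ("host-1", "web-server", 1200),
    ("host-2", "web-server", 900),
    ("host-3", "web-server", 1500),
    ("host-4", "web-server", 1100),
    ("host-1", "db-server", 300),
    ("web-server", "db-server", 4000),
]


def build_traffic_graph(flow_records):
    """Return {destination: {source: total_bytes}} for a directed traffic graph."""
    incoming = defaultdict(lambda: defaultdict(int))
    for src, dst, n_bytes in flow_records:
        # Repeated flows between the same pair are merged into one weighted edge.
        incoming[dst][src] += n_bytes
    return incoming


def high_fan_in_nodes(incoming, min_sources):
    """Return nodes contacted by at least `min_sources` distinct sources."""
    return sorted(
        node for node, sources in incoming.items() if len(sources) >= min_sources
    )


graph = build_traffic_graph(flows)
suspects = high_fan_in_nodes(graph, min_sources=4)
```

Here `suspects` contains only `web-server`, the one node receiving traffic from four distinct sources. A fixed threshold like this is exactly the kind of hand-crafted rule that a learned graph model is meant to replace.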

The role of GNNs in cybersecurity is actually quite a complex topic, and if your readers are interested in practical applications, I recommend reading the study “Graph Neural Network-based Android Malware Classification with Jumping Knowledge” by M. Gallagher, M. Portmann, and others. In short, the authors showed how GNNs can classify malware in Android Application Packages by exploiting the rich relational data within their function call graphs. This approach makes it possible to find and analyze sophisticated malware threats that traditional detection systems miss. I believe this technology has the potential to dramatically enhance the cybersecurity measures available to us in the near future.
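For readers unfamiliar with the "Jumping Knowledge" idea named in that title, the sketch below illustrates its general principle: instead of using only the last layer's node representations, the outputs of all layers are combined, here by concatenation. This is not the architecture from the study. The propagation rule (a plain average over a node and its neighbors), the toy call graph, the single per-function feature, and the mean-pooling readout are all simplifying assumptions.

```python
def propagate(adjacency, features):
    """One simplified propagation step: average a node with its neighbors."""
    out = {}
    for node, neighbors in adjacency.items():
        group = [features[node]] + [features[n] for n in neighbors]
        dim = len(features[node])
        out[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return out


def jumping_knowledge_concat(adjacency, features, num_layers):
    """Run `num_layers` steps and concatenate every layer's output per node."""
    per_layer = []
    current = features
    for _ in range(num_layers):
        current = propagate(adjacency, current)
        per_layer.append(current)
    return {
        node: [value for layer in per_layer for value in layer[node]]
        for node in adjacency
    }


def graph_readout(node_embeddings):
    """Mean-pool node embeddings into one vector for the whole graph."""
    vectors = list(node_embeddings.values())
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Toy call graph (treated as undirected): main calls parse and send,
# and both of those call log.
call_graph = {
    "main": ["parse", "send"],
    "parse": ["main", "log"],
    "send": ["main", "log"],
    "log": ["parse", "send"],
}
# One hypothetical feature per function, e.g. a flag for a sensitive API call.
node_features = {"main": [0.0], "parse": [0.0], "send": [1.0], "log": [0.0]}

embeddings = jumping_knowledge_concat(call_graph, node_features, num_layers=2)
graph_vector = graph_readout(embeddings)
```

Each node ends up with one value per layer, so nearby and more distant neighborhood information sit side by side in the same embedding. The pooled `graph_vector` is what a downstream classifier would consume to label the whole application.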

In your speech at the recent ML and Data Engineering conference apply() Spring ’24, you talked about Semi-Supervised Learning. Can you tell our readers more about this technique? How does SSL help ML Engineers deal with the challenges of cybersecurity?

In Machine Learning, Semi-Supervised Learning is a training technique that involves combining a small amount of labeled data with a large volume of unlabeled data. It’s a truly fascinating area, especially when applied to cybersecurity. In this line of work, it is often difficult to obtain so-called ground truth data for every potential threat. We deal mostly with unlabeled data, and this is where SSL shines!

To give you an example, SSL is used to enhance the detection of zero-day exploits. We train models on a mixture of known threats (labeled data) and a larger pool of unknown or new data. With this technique, our models learn about new threats that don’t exactly match any already known patterns but show suspicious characteristics.
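One common way to combine a small labeled set with a larger unlabeled pool is self-training, also called pseudo-labeling: fit a model on the labeled data, label the unlabeled samples it is confident about, add them to the training set, and repeat. The sketch below is a generic illustration, not necessarily the method described in the talk. The nearest-centroid classifier, the two-dimensional feature vectors, and the `margin` threshold are assumptions chosen to keep the example small, and the code expects at least one labeled sample per class.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled, unlabeled, margin, max_rounds=10):
    """Self-training with a nearest-centroid classifier.

    labeled:   list of (features, label) pairs, labels "benign" / "malicious"
    unlabeled: list of feature vectors
    margin:    minimum gap between the two squared distances for a
               pseudo-label to count as confident
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        # Refit the "model": one centroid per class from all current labels.
        centroids = {
            label: centroid([x for x, y in labeled if y == label])
            for label in ("benign", "malicious")
        }
        confident, uncertain = [], []
        for x in remaining:
            d_benign = squared_distance(x, centroids["benign"])
            d_malicious = squared_distance(x, centroids["malicious"])
            if abs(d_benign - d_malicious) >= margin:
                label = "benign" if d_benign < d_malicious else "malicious"
                confident.append((x, label))
            else:
                uncertain.append(x)
        if not confident:
            break  # Nothing new to learn from; stop early.
        labeled.extend(confident)
        remaining = uncertain
    return labeled, remaining


# Two labeled samples and five unlabeled ones (hypothetical feature values).
seed = [([0.1, 0.2], "benign"), ([0.9, 0.8], "malicious")]
pool = [[0.0, 0.1], [1.0, 0.9], [0.2, 0.1], [0.8, 1.0], [0.5, 0.5]]

final_labeled, still_unlabeled = self_train(seed, pool, margin=0.5)
```

In this run, four of the five pool samples receive pseudo-labels, while the ambiguous point `[0.5, 0.5]` stays unlabeled because it sits roughly halfway between the two classes. Leaving such samples out is the point of the confidence threshold: wrong pseudo-labels would otherwise be fed back into training.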

SSL is also used in behavioral biometrics for continuous authentication systems. These systems analyze patterns of user behavior (think typing rhythm, mouse movements, etc.). With SSL, you can develop models that adapt to the minor variations in a person’s behavior over time without needing explicit labels for every data point, which would be simply impractical to collect.

For those interested in exploring this technique further, there is a recording of the speech you mentioned, in which I delve deeper into the use of SSL in cybersecurity. There, I explained the matter in much more detail than I’d be able to in this format.

[Interviewer’s Note: We will include a link to the recording of the speech below this interview for those who are interested.]

Thank you for highlighting that, and I’m glad to be able to share these insights with your readers!

Meanwhile, let us turn to a less technical aspect of your work. You have told us about data science teams that you built and mentored. What approaches to team building did you find the most effective in your line of work?

Of course, although, in truth, there is no recipe specific to my line of work. It’s rather a universal approach that can be applied in any area or industry. In my experience, to foster a culture of innovation and collaboration, which is part of any modern leader’s job, you have to start by identifying the genuinely passionate people in your team. People are the most efficient and productive when they are doing the work they love. To cultivate this environment, especially in a challenging field like ML, I focus on building teams with members who are naturally curious and deeply interested in continuously learning new things, elevating existing skills, and acquiring new ones.

Once the team is assembled, I give priority to establishing a foundation of trust and open communication. I encourage team members to share their ideas freely and challenge conventional thinking without fear of judgment. This openness leads to a more innovative atmosphere where creative solutions can emerge.

To give you an example of a specific approach I personally found effective, I organize team members into pairs or small workgroups to promote collaboration. Not only does this approach help in blending diverse skill sets to achieve better results, but it also fosters stronger working relationships, which is crucial for solving complex issues. Working closely together, team members can build on each other’s strengths and achieve more than any of them could on their own.

Given the rapid evolution of cybersecurity, how do you keep up with the latest developments while ensuring practical application and innovative strategies in your work?

To keep up and stay relevant in the dynamic field of cybersecurity, I read a lot of professional media, such as Malwarebytes, and attend industry events. I also recently enhanced my expertise by completing a graduate certificate in Artificial Intelligence at Stanford. These activities not only deepen my knowledge but also directly inform the innovative strategies I implement in my work. This way, I ensure I remain at the forefront of current technological advancements.

Looking ahead, which emerging AI trends or techniques do you believe will shape the future of cybersecurity and malware detection, and what advice would you give to data scientists to prepare for these changes?

I expect that our industry will increasingly embrace deep learning, mirroring trends we see in other sectors. For instance, even today, malware detection still relies in large part on hand-crafted features, and this keeps limiting our understanding of the underlying data. For use cases where source code is available, I anticipate a significant shift towards utilizing Large Language Models and other sophisticated deep learning architectures. I believe this approach will dramatically enhance our ability to detect and analyze malware.

As for data scientists who aim to stay at the forefront of these changes, my advice is to deepen your understanding of deep learning technologies. Keep yourselves up-to-date with the latest research and practical applications in AI, experiment hands-on with available code, and engage in collaborative projects. Given the high involvement of subject matter experts, maintaining a collaborative mindset is essential for a thorough understanding of our constantly evolving field. Embrace these emerging trends, and you will not only enhance your analytical capabilities but also equip yourselves with the tools necessary to effectively tackle advanced cybersecurity challenges.

Link to the recording: 

Semi-Supervised Learning. How to Overcome the Lack of Labels with Aleksandr Timashov, ML Engineer at Meta




