Jonas Christensen 2:41
Alex Antic, welcome to Leaders of Analytics. It is so great to have you on the show.
Alex Antic 2:47
Thank you, Jonas. Absolute pleasure.
Jonas Christensen 2:49
I have been looking forward to this conversation for a few weeks now, since we organised it. You've just told me that this is the highlight of your day. So we're in for a tough conversation, I'm sure. And Alex, I've obviously done my research on you and found a very, very impressive resume that you have managed to establish over the last 25 years. And we'll explore that more. But before I talk about that, we should hear straight from you. So could you tell us a little bit about yourself, your career background and what you do?
Alex Antic 3:21
My name is Alex. I'm a recovering quant, an ageing mathematician and a reformed academic. I think that's probably the easiest way to sum up who I am. Having grown up watching spaghetti westerns, I like to think of myself as a data science gun for hire. But in reality, it's a little bit more boring. I run my own consulting, advisory and training business, which allows me to actually help people in industry, government, academia and startups with their data science and AI challenges, and it's a lot of fun. I love what I do. I also really enjoy presenting keynotes, workshops and conferences and industry events both here and abroad, appearing on podcasts such as this, and in my free time I'm spending some time writing: co-authoring some textbooks and even working on short film ideas. I just can't tear myself away from the world of AI and analytics, and can't help but, I guess, share my love for it. But if I can go back a bit: many years ago, after studying maths and computing at uni, I started my career as a quant, as you mentioned, which I consider to be the original data scientist. I was lucky to actually work with some amazing people at places that included Macquarie Bank, Commonwealth Bank and Colonial First State. I was also very lucky at the time to meet one of the godfathers of quantitative finance: Myron Scholes, one of the pioneers of the Black-Scholes equation. But after many years of working in the quant space, I ended up moving into the government sector, just as my interest in the notion of data science for social good really increased, and also around the same time that data science was becoming established as a field, about a decade ago now. Since then, my career has included working as a consultant and in academia. It's been quite broad.
However, the common thread throughout my entire career has centred around this notion of using data and analytics to help solve complex and challenging problems, and to really just help others see the benefit of using data and AI. That gives you kind of a rundown on who I am and how I've got to where I am today.
Jonas Christensen 5:04
Yeah, very interesting. And we probably have a similar academic background, if I may say so, in terms of starting out in that sort of quantitative finance realm. That's what I studied as well, when I was just a young man. I do remember the Black-Scholes formula, having practised it late into the night before exams. You have a much more impressive CV than I do some years later, I must say. And you've already mentioned how you've also worked in academia for many years, and a lot of roles across government, and also board roles and advisory roles. We need to explore those, of course, but how did you get into data science in your career? Sort of, what was the moment where you went: ''Okay, this quant stuff is actually good, but there's something here that I really need to pursue''?
Alex Antic 5:49
Sure. Yeah, great question. From a very early age, I had a real interest and passion for maths and its ability to try and make sense of the world through logic and quantitative techniques. And just as computers were becoming the norm and an integral part of our lives, I thought that studying both maths and computing would be fun and possibly useful at some point. Little did I know at the time just how integral that combination would be to the future of analytics. My love of mathematics eventually led me to pursue an Honours degree in pure maths and a PhD in applied maths. So the eventual transition to what is known as data science these days felt completely natural to me, given its dependence on my training in maths, stats and coding. And as my career developed, my passion for helping others, especially at a strategic level, really allowed me to bring together all that I've learned to help organisations innovate and then deliver impact and change. But sometimes I feel like I actually work in the field of data politics, not data science. I mean, as I'm sure you can understand: once your seniority grows, you quickly realise that data science is not so much about the technology. It's really about the people and managing people's expectations of it, and trying to juggle all the complications beyond the technology itself. So, those parts have been a challenge. But at the same time, it's been incredibly rewarding to work in this field of data science and AI, and hopefully make a difference in some small way.
Jonas Christensen 6:59
So, what are some of these typical challenges that you see in, what you call, the data politics space? What are some of the daily obstacles there for yourself and, similarly, other leaders to deal with?
Alex Antic 7:08
Often it comes down to what I think is a lack of data literacy amongst senior executives and decision makers: them not really having, I guess, realistic expectations of what you can and can't do with data analytics. They may have been sold a dream by a vendor, or seen some great presentation on AI and its benefits, and don't really realise the reality of setting up the capability: what is needed in terms of having the right people, the right infrastructure, the right tools, the right support, and ultimately, the right culture to support it. I think they often think, you know, you just throw some money at it, bring on a data scientist or a data engineer, and suddenly all your problems will be solved. But it's much more nuanced, as you obviously would know as well: How do you formulate the questions in the first place? How do you make sure there's clear alignment between the technology and the strategic goals you're trying to achieve? How do you develop the right culture to begin with, one that believes in the power of data-driven, evidence-based decision making? That's where I see a lot really struggle. And then there are cases where they've invested in data science and they've got some great, you know, proofs of concept, but then often there's a hurdle they face when it comes to productionising those models. So, there are a number of pain points there. I've seen a common issue where a lot of organisations can struggle to really scale up their data science capability. I think the answer is having the right leadership: having people in the CDO and CTO roles who can really help guide the board and senior executives down the right path.
Jonas Christensen 8:30
Yeah, very interesting. And you sort of make me reflect on the last 40 years of history, and these sorts of things playing out more than once. So if you go back - or perhaps 30 years - if we go back to, sort of, the mid 90s, there were a lot of executives that didn't have any computer experience. They weren't computer literate as such. So for that generation, it would have been hard to accept that things were being computerised and digitised. Fast forward to, sort of, 2010, call it, and all those executives are computer native. And fast forward to today: everyone on executive committees will have had the internet in their pocket for quite a long time. That maturity is there, and therefore people have direct experience with those solutions and products and they can make informed decisions. We're not there yet with data and data science, and I think it will take another sort of 5 - 10 years for people to really bite onto it. So, that's kind of how I reflect on your comment there. Is that something you can relate to?
Alex Antic 9:28
Very much. I agree that you have to get to the point where it is a common part of the conversations that senior executives have when it comes to actually driving decision making. I think that a lot of them are becoming aware of some of the gaps that exist in their own understanding and that of their teams and colleagues. But I haven't seen enough commitment, broadly, to them upping their education and their understanding of a lot of the technical and broader aspects of AI to really make a difference. However, I'm seeing that change, especially as regulation and, I guess, demand from citizens continue to increase around, you know, explainability and the responsible use of AI. I think a lot of that will actually drive positive change in the coming years. I'm really optimistic about that.
Jonas Christensen 10:12
You mentioned here consumer expectations and also legislation. And I think that brings us to the main topic that we want to discuss today, which is: how can we use data to build a better world? There's a lot of risk associated with the ascent of AI, and so on, and also a lot of doomsday stories out there. But AI has the possibility of doing a lot of good for humanity as well. I'm sure you have a lot of experience in that space too, as someone who has worked across many government departments. So, with your varied experience across enterprise, consulting, government, academia, etc., I'm interested in your view on the role data science and AI should and can play in society.
Alex Antic 10:58
An important lesson that I've learned throughout my career is that data science and AI, and tech more broadly, are just a means to an end. They exist to really help us. I really believe in the power of data-driven decision making, in which data analytics is used to provide evidence-based support for better informed decisions. It might not always make decision making easier, but I think it always helps people make better decisions, which is really, I think, the core of it. I feel we're at a pivotal point in history, where we can take control of how tech will continue to shape our lives, especially in relation to the ethical and responsible use of AI that we just touched on. And given the vast amounts of data being generated and used to drive decision making across many different industries and sectors, I think we have no choice but to really leverage and embrace data science and AI to help us make much better, much more informed decisions. However, at the same time, we really need to be careful with how we do this, and what bounds and regulation we put in place to support it. You just need to look at the abuses of data analytics, even potentially subversive ones - examples such as Cambridge Analytica, and many more recent ones - to see just how quickly and easily things can go wrong, especially at scale. One of the huge benefits of AI is its scalability, but that can be a detriment as well. However, overall, I think it's really fantastic to see how much societal discourse there is on how and why tech is being used to influence our lives for social good. And I think this really needs to continue at all levels, to ensure that we create a society that successfully leverages the benefits while also mitigating the risks. And there are many, many positive things happening both here and abroad around data privacy, data sharing and, you know, different regulation and processes being put in place.
And a lot of, I guess, the negativity that's been associated with some of the big tech giants in recent years has really helped people, I think, become much more aware of what's happening and how it affects them personally, once they realise all the tech that's running on their phone. You know, all these amazing deep learning neural networks that are just helping aid their life day-to-day, and just how quickly and easily they share their information. And then hopefully realising, you know, that can come at a cost. You're giving away this information for free, and there can be a downside to that, beyond just seeing spam all the time on your phone or on your computer. So I think people are becoming much more cluey to the pros and cons, and much more careful, much more willing to stand up and say: ''We want a say in terms of how it's being used''. I think that's really good. It's great to have these conversations. And it's not happening just in one or two countries. It's really an international movement, which I think can only be a positive thing for everyone.
Jonas Christensen 13:20
Yeah, absolutely. I think the thing to reflect on with social media is that it's free because we're the product. We're not the users.
Alex Antic 13:29
Exactly.
Jonas Christensen 13:31
But we're talking about AI being used for social good here. So could you give us some examples of how AI is being used for social good?
Alex Antic 13:38
There are so many examples we could talk about this afternoon. In a broader data-driven sense, one that's topical and a good example to begin with, I think, is COVID itself. Many people who don't normally deal with data analytics realised just how important data is. In a way, it's helped them clearly see the rapid growth in case numbers and the spread of the pandemic, not to mention the various apps up and running on their phones to help with tracking cases and alerts. Beyond that, it's helped them also realise just how people, organisations and countries can work together to really make a difference by leveraging data. Where I'm seeing really valuable and exciting use of data and AI to drive social good is in the broader context of data sharing, and not just the use of the sexy AI algorithms that we hear so much about. Data sharing such as what government agencies do between one another, and data sharing that occurs between government and industry, with the aim of supporting social good. Examples can include, you know, improving health care, such as particular cases where we're trying to identify those that are in need through proactive intervention: by being able to share data between government, medical providers and maybe even social media, to be able to offer the people who need it most personal, tailored medical treatments. I think those cases are really exciting, and ones that don't get much airplay at times. Beyond this, we've also got the common examples of data sharing being used to help fight crime and fraud and detect such risks in real time. It just gives law enforcement abilities that were previously impossible, especially at scale and in real time. Other examples can include, you know, using AI for crisis management, such as predicting the spread of bushfires or other types of natural and man-made events.
Adaptive learning for early education: trying to create tailored learning regimes for young children that are better suited to their learning capabilities and strengths. Tackling environmental and agricultural changes. Tracking health in real time based on wearables, and many other examples from the public services sector. Overall, though, I think it's really fantastic that many organisations and governments are jumping on the bandwagon of using AI to actually drive the broader social good by investing in it. I think that's a really important thing: investing in the technology and the capability, and then building the right teams and processes to actually move it forward in the right direction.
Jonas Christensen 15:43
So, you have worked in several Australian federal government departments, implementing AI solutions, overseeing those solutions, and whatever it takes to get them stood up. In terms of whatever you're comfortable sharing: could you give us a concrete example of something you stood up with the team there and the challenges you faced? I suppose challenges that are typical, but also ones that are more unique to a government or public entity.
Alex Antic 16:12
Yes. It's probably also worth talking about the use of AI and the processes and practices more broadly used in the public services space, just to give people a better understanding of how it might differ from, say, the private sector. Overall, there's a great deal of focus placed on the safe and secure capture and storage of data, with clear ethical guidelines in place on what can be captured and for what purpose. Specific guidelines and frameworks also dictate how data is used for specific outcomes, such as data matching, de-identification and data sharing, all within the bounds of regulation and legislation. And then surrounding all that there's, you know, security clearances for key personnel, roles and responsibilities which limit what data they can access, security classifications of the systems themselves, and air-gapped systems, which are not connected to the net but hold, you know, sensitive information. And then there's this notion of augmented decision making: making sure there are humans in the loop, and that you don't have completely automated decisions, which can potentially have harmful effects on those citizens that we're trying to serve. I think it's also really important that people understand that, especially within the public service space, there's a fine balance that exists between social licence - what society allows you to get away with in terms of using their data and technology to make a difference - the regulation and legislation that surrounds all this, and also social good - what you're trying to do to actually drive society forward. As technology tends to move faster than regulation, the challenge often becomes: how do you leverage current and emerging tech to create societal impact and change whilst remaining within the bounds of any governing regulation and legislation?
So as an example, I was previously lucky enough to work in a government agency where I helped kick off an exciting project that used this idea of privacy enhancing technology, in partnership with private sector organisations as well, all of whom had the same aim in mind: to help detect and deter financial crime, especially at scale - things like money laundering and terrorism financing. This was really a world-first initiative, and it was a fantastic technology that allowed us to work within these bounds and to use this technology without having to worry about breaching any privacy and data sharing laws. This notion of privacy enhancing technology is really, I think, an important topic that will become much more popular in the future. It's really close to my heart, from both a research perspective and an application perspective, in that it's designed to ensure data privacy by allowing us to work with data without actually seeing it. It allows us to protect data in use, and not just at rest or in transit. And I really think that the future for privacy enhancing technology, beyond the public sector as well, is really exciting. I think we're at the cusp of a whole new era of data sharing potential, as it allows us to leverage cutting-edge tech such as homomorphic encryption and differential privacy to tackle challenges which were previously impossible. Some of the tech giants such as Google and Facebook are already using this tech - you might not even notice it running on your phone. Applications have potential in financial services, healthcare and cyber, which I think, in particular, is very exciting, especially in relation to social good. So when it comes to actually - I mean, you asked about standing up some of these capabilities in government and running them - often you're faced with the same hurdles you'll face in any organisation. It's winning funding and support initially.
Making sure you can clearly align the technology, the solution you're trying to create, to specific strategic outcomes, and making a strong case for that. That often, you know, means you have to not only articulate it well, but, you know, there's a whole journey you go on to get the right people on board and make sure that they understand what the key benefits are: going through proofs of concept and helping people realise what adverse outcomes can result if they don't invest in this. Beyond winning the funding, it's being able to build the right teams and getting the right people in place: having, you know, not just technical staff but the right legal representation, having project managers and people with a data governance and data ethics background, and making sure that you really manage those relationships, both internally and with external organisations, given these public-private partnerships. Often in these cases it means working in an agile way, to make sure that there are constant conversations about how things are progressing and that you're always clearly aligned with the strategic goals. Overarching all that is having the right leadership, which I think is kind of one of the most crucial elements. No matter if you're in the public or private sector, there are always the same issues there. But on the public sector side, you often feel like you're being held to a higher standard, because you're privy to so much sensitive information and have so much power, in a way, that having the right leadership to navigate and explain that is actually quite pivotal and helps projects like this become very successful.
Jonas Christensen 20:33
Yeah, you're really making me think about the type of information that sits within government entities. And it's often linked to some sort of government identifier, right? So you have some sort of national ID - in Australia, we talk about tax file numbers or Medicare numbers; in other countries, they might have a social security number or something similar - which is the number that identifies you as a person. With that comes the keys to someone's actual identity, to some extent. You can do a lot of damage to an individual and a whole group of people if that data gets into the wrong hands. And you're talking about these sorts of privacy enhancing technologies. To bring that down to earth and get some clarity on exactly what these technologies do, would you be able to give some examples of where they might prevent some data from getting out, or stop somebody in their tracks from doing the wrong thing, or however they work?
Alex Antic 21:24
Sure. So, we'll talk about two specific examples which come up often. The notion of differential privacy allows you to effectively use data which is anonymised at an aggregate level. So you can work on this notion of having a pool of data together that you can release - something that previously would have been done through, say, de-identification, which has high risks - but have some mathematical certainty around the reduced risk of anyone being able to identify any particular individual who's involved in that particular data set. So, there are many examples where people are able to share data, to disseminate data in a broad sense, without having to really worry about, I guess, privacy leaks. Often the notion of differential privacy is used for aggregate-level data sharing and analysis, and it tends to be much, much more robust against de-anonymisation and linkage attacks than more traditional de-identification processes. I guess, technically speaking, it's based on this notion of adding statistical noise in a measured way to a dataset, and then being able to account for that noise when you want to look at specific information. So there are a lot of cases where it can be useful: in the health sector, or in just managing your privacy between tech vendors and yourself in terms of using it on your phone. And when it comes down to its usage, it ultimately comes down to a trade-off between privacy and accuracy that users must manage, and you have some control over the way that's formulated mathematically. An example use case, you know: an organisation is releasing a large data set for public research - say, someone holding medical information wants to release it more broadly and wants to combat the risk of re-identification. Differential privacy can be a fantastic fit for that. Google used it in relation to COVID.
Looking at community mobility reports, helping people gain insight into the spread of COVID without having to worry about individuals' information and movements being put at risk. And there are also ways you can use it to train machine learning models on private data - differentially private stochastic gradient descent is one that comes to mind, which can be fantastic for that. So it's typically best suited to analysing broad trends rather than detailed analysis. Now, on the other hand, if you're looking at trying to use cutting-edge privacy enhancing techniques on individual data, rather than aggregate data, then you'd probably want to look at techniques like homomorphic encryption, which I think are really exciting, and which have only in recent years become computationally feasible. They can kind of blow up your dataset once you start encrypting the data, but the beauty of them is that you can encrypt your whole data set and then run machine learning or analysis on top of the encrypted data, which you could never do before with any other techniques. And there are different levels, in terms of what type of homomorphic encryption technique you use, that allow you to do either a full range of operations on your data or only a subset of them. So that's fantastic, and I think application of this is something I see increasing in popularity in the coming year or two. At its core, the data remains encrypted in memory, during processing, and at rest. An example of this could be: a vendor's proprietary machine learning model is trained on your encrypted data, and only you can decrypt the results. So that's effectively privacy-preserving machine learning. But it does come with challenges. It can be very computationally expensive, which limits practical use and scalability, and it can also have restricted or limited operations due to inefficiencies, so it can limit the type of machine learning you can actually do on your data.
And sometimes there can be issues related to data quality, integrity and suitability, among others. But overall, it's an incredibly powerful technique and, as it becomes much more computationally viable, I think more and more organisations will see the benefit of it. You can just imagine: you've got your data encrypted, sitting on your server; you give someone access to an API; they can create machine learning models or do analysis on top of that; and you don't have to effectively worry about, you know, any sort of data leakage, even in the quantum world. So I think that is an area that's super exciting in the years to come, and one that I'm really passionate about. Hopefully that gives you an idea of some of the prevailing techniques in this realm of privacy enhancing technology and how they can be and are being used these days.
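[Editor's note: to make the differential privacy idea described above concrete, here is a minimal sketch of the Laplace mechanism - adding statistical noise in a measured way to an aggregate query. The dataset, the query and the epsilon values below are invented for illustration; real deployments would use an audited library rather than hand-rolled noise.]

```python
import numpy as np

def private_count(data, predicate, epsilon=1.0):
    """Return a differentially private count of records matching `predicate`.

    A count query has sensitivity 1: adding or removing one person changes
    the true count by at most 1. Noise is drawn from Laplace(sensitivity /
    epsilon), so a smaller epsilon means stronger privacy but a noisier
    answer - the privacy/accuracy trade-off mentioned in the conversation.
    """
    true_count = sum(1 for record in data if predicate(record))
    sensitivity = 1.0
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Hypothetical aggregate query: how many patients are over 60?
ages = [34, 61, 72, 45, 68, 59, 80, 22]
print(private_count(ages, lambda age: age > 60, epsilon=0.5))
```

Each caller sees only the noisy result, never individual records, and the noise gives the mathematical guarantee Alex refers to: the released answer is almost equally likely whether or not any one individual is in the dataset.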
Jonas Christensen 25:21
Yeah, it's very fascinating. And it's making me reflect on a couple of conversations I've had on the podcast with Shalini Kurapati and Minhaaj Rehman, where we talked about synthetic data. And synthetic data can be used, to some extent, to reduce the risk of datasets. But this way of encrypting data and running models on top of it is really fascinating. And I must confess that I know very little about it, but it's definitely an area that I would like to know more about. It sounds like it's a fairly new discipline. Where would people go to find information on this?
Alex Antic 25:55
Great question. There are, I think, limited resources available currently. I can dig some up for you that we can share with listeners, and then we can promote that through the website. There are some examples of work being done by the big four, through their websites and keyword searches associated with them and privacy enhancing technology. I think we can link to some common examples. Most Google searches of privacy enhancing technology, to be honest, will come up with some leading examples, and some organisations that are doing research in this. It's quite a nascent field in many ways, but there are certain groups that are kind of specialising in it. So I'm happy to share some links, if people are interested.
Jonas Christensen 26:30
Yeah, that'd be brilliant. So we'll link to that in the show notes, for sure. Now, Alex, I want to move to another topic that we've touched on today but haven't discussed in depth. We're seeing this notion of data sharing and data portability becoming more and more important, and regulations are popping up around the world, actually, to some extent even mandating that certain industries have to open their data up for sharing. So in Australia, we have open banking, and open data is the further extension of that, which will come over time. And the Australian government has just announced a proposed extension to what we call the consumer data right, which is the right to port your consumer data between organisations, more or less. That's what it is about. Now, that'll also include general financial services. So, we have open banking already, but it'll also include things like insurance, and so on, and then also telecoms, so your phone and internet. That's quite a lot of personal and transactional data that you can start to port around different organisations and combine. What do these open data initiatives mean for our collective ability to build a better world with data, in your opinion?
Alex Antic 27:41
I think this is actually an exciting part of the broader push for the democratisation of data, and a collective, growing demand to control our own data and the ability to give explicit permission and consent on exactly how and for what purpose our data is used. I believe that sharing is caring, and by that I mean increased sharing of data is vital in helping provide better personalised and more efficient services, and also in increasing risk detection and mitigation, which definitely pushes forward the whole notion of data used for social good. I see that true social progress is based on the ability to safely and efficiently share data, including via emerging techniques like privacy enhancing technologies, which allow us to work within the current bounds of regulation and legislation - in the sense of either bringing data together for analysis and usage, or disseminating data, as we talked about in the case of differential privacy, for instance, so others can benefit from it. So overall, initiatives such as this, I think, help us move forward in the right direction and allow us to have a voice in how our data is used. So I'm very much all for them. It's fantastic that all these new initiatives are being released fairly regularly, both here and abroad, and it's great that we're not really being left behind - in a way, we're quite proactive in moving forward with this.
Jonas Christensen 28:53
Yeah, I myself was pretty impressed with the Australian Government on this particular matter. And I think we're among the leaders in the world, in this space. So I think that's pretty exciting.
Alex Antic 29:03
Yes.
Jonas Christensen 29:04
Now, Alex, if you don't mind, I'd love to shift to another topic - a related topic, but a different topic nonetheless - because we've talked about your government background here and the things you've learned from there. You've also spent many years in academia, and that is, of course, a place where you still are, and you're passionate about it and about educating the future leaders of analytics. So I'd love to talk to you about that. When you and I started working in analytics, or data science, whatever you want to call it, there weren't really any of these sorts of academic courses or full degrees that you can do today. So that's, of course, changed a lot, especially in the last 5 to 10 years. Could you tell us about the evolution of data science education over this time period, and what you do today to train and develop the next generation of leaders in this space?
Alex Antic 29:56
It's actually been amazing to witness throughout my career how the educational space has really changed, in particular in relation to education and training within the data analytics space. There are just so many options these days, especially if you want to learn at your own pace, and it's become so much easier to pick and choose what you want to learn: massive open online courses, private training, in-house training that some organisations are developing, micro-credentials and short courses. So many credible options for working professionals, for instance, who want to reskill and upskill in data analytics, which is fantastic to see. Some of these are programmes I've helped launch and run during my time in academia, and still do these days throughout my work. One of the best parts of my job is really sharing my experience, expertise and insights with others through teaching. The teaching part of my business focuses on developing and delivering tailored teaching and training solutions for the people I work with, by leveraging my industry, government and academic experience, my broader knowledge of the field, and my network of people I can reach out to, to collaborate with me, give guest lectures and contribute in many other ways. This includes developing technical training, such as teaching people how to leverage R, Python, SQL and Spark to get up and running quickly; discussing different applications of various technologies and how they can benefit an organisation - for instance, when would you use NLP or graph analytics techniques? When are they suitable, and what type of benefit can you actually realise? The training also includes helping organisations develop their data science capability from the ground up: What type of people do I need? What culture do I need? What type of infrastructure and tooling will I need?
This also touches on C-suite training on data literacy: helping them understand what capabilities exist for them and their teams, what's realistic, how they can pivot what they're doing to really move towards a data-driven organisation, and how to establish and lead data science and AI teams. Some of them don't realise that leading IT and leading data science are quite different. To me it really comes down to this notion of discovery versus delivery: data science needs to be scientific in many ways, versus traditional IT. They need to work together in a symbiotic relationship, but there are key differences that people need to understand to really have success and scalability with those opportunities. I also do career coaching, one-on-one or team-based, for those wanting to build and grow their careers, and I run executive workshops and retreats, which is always fun. Through my affiliations with several leading universities, I help create courses, programmes and initiatives to train and educate future data analytics professionals and leaders - once again, by leveraging my industry experience and insights to ensure there's relevance between what's being taught and what's actually being used in industry, and that it meets current and future needs. That means looking at emerging technologies and nascent fields, and where universities should start educating current students to make sure they're ready. A big part of that - some of the work I'm currently doing with RMIT - is really around ''How do we ensure these graduates are actually work-ready?''. We want to make sure that when they hit the ground running for an organisation, they're not just technically proficient, but have the broad range of skills and experiences needed to really add value, to be truly valuable employees.
These are the things traditionally known as soft skills, which I think are absolutely integral to success as a data science professional. Things like communication skills: How do you work as part of a team? How do you apply design thinking and Agile methodology? Things they wouldn't necessarily be taught in technical STEM studies, but ones that, with the right balance of work-integrated learning, make them really stand out compared to other graduates. I think that is the future of data science education and training. Where it's heading is towards that notion of an apprenticeship-based element, where you don't just learn the theory, start a job and you're ready to go. You need some integration with hands-on practice to really round out your studies. I think that's the future of the training, and something I'm really excited to be part of and looking forward to helping develop in the coming year or two.
Jonas Christensen 33:51
So you're talking here about some sort of apprentice model, or a work experience model through universities at the same time. Is that it?
Alex Antic 33:59
Exactly. For instance, a one-year course which has credit attached to it. But that one-year course is not so much learning theory in a classroom. It's them actually working for an industry partner of the university, where they're embedded in a team and they have real outcomes. They are effectively an employee for that fixed time period who can deliver value. They have support from mentors and RMIT staff and academics. They have support from their peers, from other students going through the program. But ultimately, they're responsible for creating value. So they have to actually do their best in a supportive environment and apply the skills they've been learning by working in a real team, rather than working on - let's be honest - the pretend problems you often work on as part of assessments at university. Those are fundamentally important, but they cannot replace the feeling and the emotion and the practicality and pragmatism attached to actually being in a team and having to deliver value. The students I've seen come out of programs like this are just so far ahead of the others. It's incredible.
Jonas Christensen 34:57
Yeah, you're making me smile, because one of the hardest things to teach at university would be: What do you do when the data is dirty, or it's not actually matching what happens in real life, and...
Alex Antic 35:09
Exactly.
Jonas Christensen 35:09
How do you deal with a call centre always choosing the wrong field from the drop-down menu, or one of those situations? That's very hard to teach when you're teaching how to build a model. When the input is wrong, the output will be wrong. We all know that. And I think this is a great development. I remember early in my career, I worked for an organisation that had a partnership like this with a university. Every year we had students in for about four months, and I'd say half of them ended up getting a graduate position after their studies, because they proved their worth while they were in the organisation. And they added a lot of value - they were already up and running. Four months is a good stint for someone to get started and bed into the organisation, and they learned a lot, and a lot faster, at the same time. So this blend is really powerful. Now, you've talked a bit here about some of the important data science skills for professionals entering the industry in the world we live in today. In the next five to ten years, what would you say are the important skills for data scientists to have, both from a technical and a non-technical perspective?
Alex Antic 36:16
First of all, I think it's important for people who are aspiring data scientists, or new to the field, to understand that becoming a strong, really good data scientist is not easy. Much like you don't just walk into a gym, pump some iron, take some protein and walk out looking like Dwayne Johnson, you can't simply learn R or Python coding, learn about some ML and deep learning algorithms, and walk into an organisation and create impact immediately. You need to spend real time and effort learning, studying and applying yourself. I think the key is to focus on the transferable, fundamental skills: the mathematics and statistics that underpin machine learning and deep learning and all the techniques you'll be using, and some really strong coding skills, even if you're not going to be a software engineer or data engineer. If you are working as a data scientist, you will be coding. So, understanding the key principles and strengths of different languages and the software paradigms that surround them - code reuse, code sharing, code commenting, source control - all those things are absolutely vital to becoming a really good data scientist, at least in this day and age, especially as the field becomes really competitive in many areas. Beyond that, other skills are absolutely paramount. Problem solving: as you touched on earlier, if you're given a dirty data set, how do you actually go about solving issues related to that if you haven't been taught specifically about imputation, missing data, data quality? How do you solve the broader problems? They won't always be technical. They could be people problems, or problems around building relationships with others and getting others on board. I think in this space, you can't escape this notion of being able to creatively and practically solve problems. Being curious, I think, is absolutely vital as well.
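The imputation idea just mentioned can be made concrete with a minimal sketch. This is an editor-added illustration with made-up numbers; real work would also weigh median imputation, missing-value indicator flags or model-based approaches.

```python
def impute_mean(column):
    """Replace missing values (None) with the mean of the observed values.

    A deliberately simple baseline for handling dirty data: it keeps the
    column usable by a model, at the cost of shrinking its variance.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a fully missing column")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

# Mean of the observed values 10.0, 14.0 and 12.0 is 12.0.
print(impute_mean([10.0, None, 14.0, None, 12.0]))  # → [10.0, 12.0, 14.0, 12.0, 12.0]
```

Even a trivial strategy like this forces you to confront the data-quality questions Alex raises: why are the values missing, and is filling them in defensible for this problem?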
Being able to ask questions - the right questions. Often it's about just exploring questions with, say, the business partners you're working with, to better understand the business context and the data you're using. That's something I see come up all the time, where some people can get really stuck early in their career. Working as part of a team: that's something that took me a while to realise the importance of early in my career, because I loved working on my own on really complex tasks - something I carried through from my PhD. But I quickly realised that as a team you can create some fantastic value, and I found real satisfaction in doing so. So I think becoming comfortable working collaboratively with others is really beneficial. I also came to understand the importance of communication skills, both written and presentation. Sometimes you can spend a lot of time creating a fantastic model, but if you can't present its benefits - in a paper, or in a presentation given to colleagues, a manager or a client - that can make or break how the model is actually used. So trying to finesse and work on your broader communication skills is imperative to having a successful career as a data scientist. I think it's also really important to surround yourself with good people and learn from them. Try and find roles where you won't be on your own and isolated, where you'll actually be working with smart, talented people who are experienced and can directly share information and mentor you. A lot of that will rub off on you as you observe how they work, the challenges they come up against and how they tackle them. And I also think - through the mentoring I do nationwide, through my program - it's important for people to try and find mentors. I had a couple of really key mentors early in my career, and I found that incredibly beneficial.
And now that I'm a mentor myself, and have been for quite a while, I've seen the benefits it has had for my mentees, but also for myself, in terms of helping me challenge things, especially when they ask me certain questions - helping me better understand different solutions to the various problems they encounter in the workforce. So I think that's really important. One tip I'd like to give people: don't rush to implement algorithms by bypassing the science in data science, and keep focusing on the business problem you're trying to solve. I think it's important to always start with a simple solution first, one you can understand, especially around interpreting outputs and explainability, and then only add complexity as needed. Some people get excited - they'll learn about neural networks and some cutting-edge deep learning models, they'll land a role, and all of a sudden they're handed an Excel spreadsheet of data, trying to make sense of it or solve some problem, and they'll try to quickly build some amazing model. I think that's deviating too far from the purpose you're there for: you're not there to create models per se. You're there to solve a business problem, using your technical nous and working with data and analytics. So the key is: always have a very, very clear understanding of the business problem, its context and its alignment to the strategic goals, and make sure the solutions you're developing are clearly aligned to that in a quantitative sense, where you can measure your outcomes and the measurable impact you're having. That will always put you in good stead to build your career, climb the ladder, source funding, and build credibility with your manager and your clients and customers more broadly.
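The "start simple, then add complexity" advice above can be made concrete with a baseline check. This is an editor-added sketch with invented churn labels: the point is simply that any more complex model must beat the naive majority-class accuracy to justify its complexity.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common class.

    This is the simplest possible model. If a deep learning model can't
    clearly beat this number, the extra complexity isn't earning its keep.
    """
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

# Hypothetical customer-churn labels: the majority class "stay" covers 4 of 6.
labels = ["churn", "stay", "stay", "stay", "churn", "stay"]
baseline = majority_baseline_accuracy(labels)
```

On heavily imbalanced data this baseline can look deceptively high, which is exactly why reporting it first keeps the conversation honest about what a model actually adds.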
Jonas Christensen 36:24
Brilliant advice, Alex. Thank you. So I can only subscribe to what you're saying. And I'd say that most of these stakeholders won't really care about your model. But they will definitely care about the outcome of it. That is what you need to communicate.
Alex Antic 41:24
You need to be able to explain that and speak in their language - not in technical jargon, but in business terms. Which I think is what you were just touching on. So, definitely.
Jonas Christensen 41:32
That's right. Speak in business terms, and make sure the model doesn't do something you don't want it to do. That's also your job. So do look after the technical side as well - it's still important.
Hi, there, dear listener. I just want to quickly let you know that I have recently published a book with six other authors, called ''Demystifying AI for the Enterprise. A playbook for Digital Transformation''. If you'd like to learn more about the book, then head over to www.leadersofanalytics.com/ai. Now back to the show.
Now, Alex, we've talked here about what's good for the individual. What are some really important data science and AI skills for us to foster as a society? What do we need around us in society to make AI do good?
Alex Antic 42:23
Great question. One of the key points I touched on earlier is this broader training of individuals - maybe not formal training, but an understanding of data literacy: understanding what data is, what it isn't, and how it's actually used to help with decision-making. I think that's absolutely critical. Beyond that, it's really logical and critical thinking, for the general public as much as anyone. We've had the historical reading, writing and computer literacy revolutions, and I think we're now at the data one - data literacy is a big one. But it's also important for the senior leaders and government officials who use data and analytics each day to make critical decisions to understand at least some of the basics of what data is: how it's used, how it's created, stored and analysed. Just some basic concepts, so they can ask the right questions and really push the data scientists and leaders who sit below them, to make sure there's credibility and trust in what they're doing. Overall, I've been seeing a lot of growing demand for data engineering skills. Many organisations are pushing to productionise and scale their data science models, but I see a lot of them become a bit too eager to productionise those models before putting in the effort to first develop the right model in the first place - not focusing on the science in data science. Anyone can quite easily develop a deep learning model these days with a few clicks, a few lines of code. But does the model actually solve a business problem? Could a simpler model be used? Is the model robust to variable data quality? I think these are the critical questions that need to be answered. And people in positions of power at senior levels are making decisions, maybe using these models on a daily basis.
Having them be a bit more data literate, and understand some of the technical elements of data science, would be really helpful. We often put too much emphasis on the models themselves - such as chasing slightly higher accuracy - rather than focusing on data quality and its suitability to the problem at hand, and without giving due consideration to sufficient and consistent labelled data for supervised machine learning tasks. I've also been noticing, as I speak to people - as they grow their data science capabilities and build up the skills within their organisations - a much greater need for high-level data science expertise: people with really strong technical and business skills, who are more rounded, with quantitative and scientific skills in particular, such as backgrounds in the traditional quant areas like maths, stats, physics, econometrics, etc. However, what some leaders don't understand is that not all data scientists are created equal. As data science and AI become more commoditised, I think the industry will get more competitive, and those with strong quantitative skills and a lot of scientific nous and ability will really stand out. So I encourage people to build up those particular skills if they want to become data scientists and be truly valuable to their organisation. Beyond that, I'm also seeing growing demand for senior executives with data science and quant skills and expertise - people moving into Chief Data and Analytics Officer roles who have some technical background in the area, rather than coming from another area quite devoid of it - because they're making decisions based on data analytics. So I think it's really important to have people with at least some basic understanding, if not hands-on training, so they can ask the right questions and further spur innovation and change in their organisations.
Jonas Christensen 45:39
Yeah, brilliant list. There's a lot there for people to take away. I think I said to you before we started recording that data science is one of the hardest fields to master at the moment, because there's such a depth and breadth of skills required to become that data science unicorn you just described - if you can do all of those things. It requires the technical knowledge and ability to produce models that are by their nature highly complex and difficult to interpret, but also the linking-in with the business: understanding business operations and other important numeric topics in a business, like finance, matters when you build these things, because typically - or often - it's about improving financial outcomes for you and for your client. On top of that, it's almost pseudo software design that we're doing, so some of those software design principles need to be included. And you talked about it yourself in terms of human-centred design, and so on. There's just a lot there. It's huge. And one other topic we need to become really strong at, in my opinion, is ethics - AI ethics. The students we're training today - that you're training - are going to have a huge influence on the privacy and ethical implications of all the AI solutions that are going to come to bear in the next 20 years. So what are we collectively doing to make sure our future data leaders are well-equipped to make these ethical design choices?
Alex Antic 47:20
This really goes to the central ethos of my business, which is centred around this notion of human-centred data science and AI, where technology is a means to an end, as I touched on at the beginning. For me, data science and AI are really about people. They have a direct impact on people's lives, and I think we can never forget that. When we're developing models, coming up with solutions and productionising these systems, the moral and ethical element should be at the heart of it. So, developing in future practitioners and leaders training around logic and critical thinking is absolutely important. You want them questioning things, making scientific and logical judgments, and making sure they understand the fundamentals, as we touched on, around data, maths and stats - all of those are absolutely critical. You want them to be able to question findings. So having a scientific mentality - things around hypothesis testing, rigorous testing techniques, reproducibility - all that is absolutely important. Formal ethical training and awareness of responsible AI needs to be explored further and taught in a more systematic way - maybe not just at university, but even at earlier ages - along with logical and critical thinking, because those concepts form the basis of all the data-driven work being done now and in the future. I also think an understanding of explainable AI is really important: why it matters, and what explainability means in association with different systems. That's something we can't turn a blind eye to. And then what we talked about before - this apprenticeship notion, where we integrate industry training as part of degrees to help bridge the gap between academic theory and the reality of data analytics practice - is really, really important, and pivotal to the success of training the next generation.
Broader than that, we also have to ensure we are increasing the diversity and inclusion of our future AI professionals and leaders. We need a representative sample of the human population by age, gender, race, disability, class, etc. Having just middle-aged white male geeks developing the models that define our lives just isn't going to work. There's a lot more I could talk about in relation to ethics and automated decision-making - happy to ramble on about that if you'd like - but I think those are really the key things: having people who can develop these systems and, most importantly, understand bias and ethics and how bias plays in. There are many recent examples that come to mind, such as in 2019, when Apple was investigated over its supposedly sexist credit card, which used an algorithm that appeared to be biased against women. Trying to eradicate issues like that, especially when they're scaled up, is really important. So sufficient training around bias, fairness, ethics, explainability and responsible use is really important - and more important than many, many people realise.
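The credit-card case above is the kind of problem a simple fairness check can surface early. The sketch below is an editor-added illustration: the decision data is invented, and demographic parity is only one of many competing fairness definitions, so a small gap here does not by itself establish fairness.

```python
def demographic_parity_gap(outcomes):
    """Difference in positive-outcome rates between the best- and
    worst-treated groups.

    `outcomes` maps each group name to a list of 0/1 decisions
    (1 = approved). A gap near 0 suggests parity on this one metric.
    """
    rates = {group: sum(d) / len(d) for group, d in outcomes.items()}
    return max(rates.values()) - min(rates.values())

# Hypothetical credit-approval decisions by demographic group.
decisions = {
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],  # 75% approved
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],  # 37.5% approved
}
print(demographic_parity_gap(decisions))  # → 0.375
```

A gap this large would prompt exactly the questions Alex raises: is the disparity driven by the training data, the features, or the people and process that built the model?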
Jonas Christensen 50:05
It is. And there is so much bias hiding in data that is not only hard to identify, but a huge risk, because we are automating all that bias unless we somehow design a way out of it. It's such a critical thing. It's easy to blame the AI, but humans put the bias in there. And that's kind of the problem: if you're automating it, you don't have humans taking it out again. That's what we normally do when we see a wrong choice - we have a chance to course-correct. But if models are automating a lot of our decisions, then that step is removed altogether. Now, Alex, before we get to closing remarks, is there anything else that you would like to say on this topic? Or do you feel like we have covered all the ground?
Alex Antic 50:53
I guess just a few more words around bias, given I think it's a very important topic. And you're right: bias can creep in through various data sets. But I think it's also important for people to realise just how much of it comes inherently through the people developing the algorithms themselves. There's an inherent human bias there, and that can be a real problem. So there are numerous points in the process at which bias can creep in; it's not simply a technical problem to be solved. That's really one point I want to make. Sometimes people ask me, ''If AI merely reflects systemic human bias, then why is it such a concern?''. I think there are really two important points here. One is what we touched on: scalability - models can be far-reaching and can reinforce and perpetuate bias. But they also have the potential to let us hide from our moral obligation to justify moral judgments. A system can have a huge effect on someone's life, and then the CEO can turn around and say, ''Oh, the model told me to do it, so I'm not going to take responsibility. It wasn't a human making the decision''. So I think that's something that's important to be front of mind in these discussions. And while we can't completely eliminate bias, we can at least work towards understanding, identifying and reducing it in AI systems. Part of that is, once again, having a good understanding of how it manifests itself in data, but also diversity: making sure we have diverse teams - technical, legal, risk, governance, gender, age, race, etc. - and that they all weigh in. It's also important for us to clearly understand, and standardise if we can, what we mean by fairness, bias, morality, privacy, transparency and explainability. So the future really lies in humans and machines working together to advance society.
So I think a way to do this is to try and build socially aware systems, such as by embedding ethical principles directly into the design of these algorithms. And above all that, having appropriate regulation and governance - because if we expect industries and organisations to self-regulate, especially when monopolies are involved, it simply won't happen.
Jonas Christensen 52:45
No. And even if they do, it's often after the damage is done. A lot of the things we've seen in the last few years are what I'd classify as the data equivalent of an oil spill or a nuclear accident: stuff gets out that shouldn't have, and then it's ''Oops, we didn't put up the right guardrails. We'll do it now''. But it's a little bit too late. We want to avoid that before it happens. Thank you for those last comments - that was really useful for me and for listeners. Now, Alex, we're getting to closing remarks. One of my last questions is: I want to ask you to pay it forward and tell us who you would like to see as the next guest on Leaders of Analytics, and why.
Alex Antic 53:25
I think Nonna Milmeister, who is the Chief Data and Analytics Officer at RMIT and a colleague of mine, would be perfect. She's a fantastic leader who's pioneering various university-leading initiatives to educate the next generation of data science professionals and leaders, some of which we've talked about. And I think she could share a wealth of experience from her very impressive career with your listeners, so I highly recommend you reach out to her and have a wonderful chat.
Jonas Christensen 53:52
That is a brilliant recommendation. And Nonna is also an interesting person with an interesting background - I've heard her story before. So that is definitely a story that should be told on the show. Thank you for that recommendation. Now, lastly, Alex, where can people find out more about you and get a hold of your content?
Alex Antic 54:10
I'm happy for people to follow me, contact me or reach out through LinkedIn: Dr. Alex Antic. Visit my website, dralexantic.com, which also links to my blog, impartiallyderivative.com. More than happy to have a chat with people and see if I can help, or leverage my network to help them in their endeavours. Always happy to chat about anything to do with analytics, data and AI - it's a real passion of mine and one I'm constantly talking about.
Jonas Christensen 54:35
That is a wonderful invite to all you listeners out there. So don't hold back - get in touch with Alex. Alex Antic, thank you so much for being on Leaders of Analytics today. I have thoroughly enjoyed the conversation and learned a lot from the insights you have accumulated across your very impressive career. All the best for the future, and we'll see you soon.
Alex Antic 54:58
Thank you. It's been an absolute pleasure, Jonas.