Companies should embrace, rather than avoid, open source software
For organisations to remain agile and responsive, it’s a business imperative to embrace open source software, but meeting regulatory requirements remains a challenge.
We’re increasingly seeing analytical models being developed in open source, and it’s easy to understand why. For an organisation that wants to operate as an agile company, open source is attractive, not least because it supports rapid, iterative development of projects and models.
The skills we see emerging from universities into our industry and professional environment are supporting this trend, with a marked upturn in graduates who can develop in open source software, using widely available programming languages such as Python and R.
It’s rapidly becoming a tired cliché, but one thing remains true – data is the new oil. Open source provides an open space to tackle new challenges, to explore the data and see what answers it contains. New projects can be deployed quickly and deliver significant successes and, importantly, open source also supports a ‘failing fast’ strategy without incurring significant development and infrastructure costs. We also see tremendous enthusiasm for open source software in companies where innovation is a top priority.
The problem many companies encounter arises when an open source model needs to be taken into production and scaled. While companies are doing a lot of development in open source, experimenting with concepts and products, putting those models into production still presents a challenge.
Evolving companies’ analytics platforms and operationalising analytics is where the challenge lies.
The hybrid approach of combining open source with proprietary software can deliver the best of both worlds, because proprietary software can address the challenges of taking a project into production and moving to scale for enterprise-wide use.
Governance is key, particularly in South Africa, where the regulatory environment is becoming tighter and more challenging. It’s important to be able to see across your entire data lineage, to know how the data has changed and how any AI inside the model has affected the results and outputs.
When the regulator asks those questions, organisations must be in a position to answer – regulators expect companies to provide the appropriate levels of traceability and auditability. This is particularly true of organisations operating in the financial sector, where regulatory requirements are extremely demanding.
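To make the idea of data lineage concrete, here is a minimal sketch in Python of what recording "how the data has changed" might look like. It is an illustration only, not a description of any particular product: the `LineageLog` class, the `fingerprint` helper, the two transformation steps and the sample records are all hypothetical names and data invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


def fingerprint(rows):
    """Return a stable SHA-256 digest of a list of JSON-serialisable rows."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class LineageLog:
    """Hypothetical append-only record of every transformation applied."""
    entries: list = field(default_factory=list)

    def apply(self, step_name, func, rows):
        """Run func on rows and record what went in and what came out."""
        result = func(rows)
        self.entries.append({
            "step": step_name,
            "input_hash": fingerprint(rows),
            "output_hash": fingerprint(result),
            "rows_in": len(rows),
            "rows_out": len(result),
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        return result

    def is_unbroken(self):
        """Check that each step's input matches the previous step's output."""
        return all(
            prev["output_hash"] == nxt["input_hash"]
            for prev, nxt in zip(self.entries, self.entries[1:])
        )


# Two illustrative (invented) preparation steps.
def drop_missing_income(rows):
    return [r for r in rows if r.get("income") is not None]


def add_income_band(rows):
    return [{**r, "band": "high" if r["income"] >= 50000 else "low"} for r in rows]


log = LineageLog()
raw = [
    {"id": 1, "income": 72000},
    {"id": 2, "income": None},
    {"id": 3, "income": 31000},
]
clean = log.apply("drop_missing_income", drop_missing_income, raw)
scored = log.apply("add_income_band", add_income_band, clean)
```

Each entry in `log.entries` ties a named step to a hash of its input and output, so an auditor can confirm that nothing touched the data between steps by calling `log.is_unbroken()`. A production platform would need far more than this (access controls, tamper-proof storage, model versioning), which is the gap the article argues enterprise tooling fills.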
A core principle is digital guardianship, i.e. providing mechanisms to meet both the expectations of the people entrusting you with their data and the organisation’s own moral compass for protecting the agreed-upon use of that data.
Crucially, controls are also necessary to provide trust in the data. Businesses have to trust that the results of models are accurate and that those models will continue to perform into the future. Transparency, governance and security are all essential components, and they become even more critical when organisations scale their efforts.
I believe that the hybrid approach of using both open source and proprietary software is the most effective way to combine rapid development with deployment at scale and reliable model management.
Robust end-to-end data science platforms benefit any company working with data at scale. It’s imperative to work with products that allow data scientists to write code and build models in the analytic programming language of their choice, but to deploy and run them in a controlled, enterprise-grade environment with full end-to-end traceability.