Archive for May 29th, 2018

  • Benefitting from data lakes

    Jessie Rudd, Technical Business Analyst at PBT Group

    Previously, I discussed some of the defining characteristics of a data lake and contextualised it in a business environment. In this article, the focus shifts to its importance for South African companies trying to remain relevant in an increasingly digital age.

    One of a data lake's most significant features is that it holds data in its raw formats. Instead of accessing preconfigured data sets (as with a data warehouse), organisations have access to a more free-flowing, natural environment. This means data scientists can find and access whatever they need faster and more effectively than if they had to go through the more rigid traditional process.

    ‘Drowning’ in data

    However, this fluid approach can be quite intimidating, especially given the sheer amount of data (both structured and unstructured) that is available to businesses. There is a risk of getting lost in the data and consequently not retrieving the information needed to drive decision-making.

    Because a data lake accepts data in any form, the temptation is to use it as a repository for everything. And, while it might seem contradictory, keeping a company's data lake organised so that it stays useful and relevant should be a priority.

    Enter the need for the data ponds I mentioned in the previous article. These are self-contained pieces of the same kind of data that are easily searchable and manageable. Instead of wading through all the data at the organisation’s disposal, data scientists can perform targeted searches in relevant pockets of data.
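    As a rough illustration of the data pond idea, the sketch below groups raw records by kind and then searches only the relevant pond. It is a minimal sketch, not a description of any particular platform: the `kind` field, the sample records, and the helper names `build_ponds` and `search_pond` are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in a lake; the "kind" field
# and all values are invented for illustration.
lake = [
    {"kind": "call_record", "customer": "A17", "minutes": 12},
    {"kind": "social_post", "customer": "B02", "text": "Great coverage in Durban"},
    {"kind": "call_record", "customer": "B02", "minutes": 3},
    {"kind": "support_email", "customer": "A17", "text": "Billing query"},
]


def build_ponds(records):
    """Group records by kind so that each pond holds one kind of data."""
    ponds = defaultdict(list)
    for record in records:
        ponds[record["kind"]].append(record)
    return dict(ponds)


def search_pond(ponds, kind, predicate):
    """Search only the named pond instead of scanning the whole lake."""
    return [record for record in ponds.get(kind, []) if predicate(record)]


ponds = build_ponds(lake)
long_calls = search_pond(ponds, "call_record", lambda r: r["minutes"] > 10)
print(long_calls)
```

    In this sketch a query for long calls touches only the two call records and never inspects the social media posts or emails, which is the kind of targeted search the ponds are meant to allow.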

    Given how relatively inexpensive storage (cloud or otherwise) has become and the fact that data can be stored there indefinitely, the business can extract information in real time whenever required. This can significantly aid decision-making, because the latest customer data and competitive trends can be factored in.

    Architecture management

    Where it becomes expensive is in the choice of data lake platform and the extent to which it is integrated into existing operations inside the organisation. In the South African context, where big data is still being positioned as a business differentiator, data lakes are going to be a hard sell for some time.

    Despite this, the ability a data lake provides to analyse data that was previously inaccessible (think social media posts and other digital communications) and to develop better-defined, bespoke customer solutions can significantly improve the business bottom line.

    Data lakes do provide advantages, yet there needs to be a change of approach from the business. It is no longer a matter of how and when to access data, but rather of using it in real time to improve agility and business readiness for the digital world.

    Organisations are starting to understand the importance of analysing and understanding data more effectively. Once they start embracing data lakes, the momentum will shift towards establishing data-rich businesses that use their understanding of market needs to create product and service differentiation.

  • A practical approach to introducing AI into corporate environments

    This article was published on Biztech Africa, on 23rd May 2018

    Source: Biztech Africa

    http://www.biztechafrica.com/article/practical-approach-introducing-ai-corporate-enviro/13583/

  • Security should underpin a cloud strategy

    This article was first published on Intelligent CIO, on the 18th May 2018

    Source: Intelligent CIO

    http://www.intelligentcio.com/africa/2018/05/18/pbt-group-expert-security-should-underpin-a-cloud-strategy/

     

  • Understanding data lakes

    Jessie Rudd, Technical Business Analyst at PBT Group

    Although data warehouses and data lakes are both considered large data storage repositories, that is where the similarities stop. While the latter offers significant business opportunities, not many organisations understand how to effectively unlock its potential.

    A data lake stores raw, largely unstructured data that comes directly from the source. The data structures and requirements are not defined in any way until the data is needed. By its nature, it is exceptionally agile and provides data scientists with a platform to extract meaningful insights.
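    The idea that structure is only defined when the data is needed is often called schema-on-read. The sketch below shows one way this could look, assuming the raw data is kept as JSON lines; the sample events, the field names, and the `read_with_schema` helper are hypothetical and exist only for this example.

```python
import json

# Raw events stored exactly as they arrived; nothing is enforced on write.
raw_events = [
    '{"user": "A17", "amount": "120.50", "channel": "app"}',
    '{"user": "B02", "comment": "Network was down again"}',
    '{"user": "C33", "amount": "80", "channel": "store", "extra": true}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time.

    Rows missing any schema field are skipped; the remaining rows are
    reduced to the schema fields, with each value cast to its target type.
    """
    for line in lines:
        record = json.loads(line)
        if all(field in record for field in schema):
            yield {field: cast(record[field]) for field, cast in schema.items()}


# Two different questions, two different schemas, one set of raw data.
purchase_schema = {"user": str, "amount": float}
comment_schema = {"user": str, "comment": str}

purchases = list(read_with_schema(raw_events, purchase_schema))
comments = list(read_with_schema(raw_events, comment_schema))
print(purchases)
print(comments)
```

    The same raw events answer both a purchasing question and a customer feedback question without being reshaped in advance, which is the agility described above. The trade-off is that every reader has to decide what to do with rows that do not fit its schema; this sketch simply skips them.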

    In South Africa, the only companies that have been able to benefit (to a limited extent) from data lakes have been those operating in the telecommunications sector. This is purely based on the amount of data they have at their disposal and their budgets to acquire the resources needed to store it.

    However, whether they choose to utilise it or not remains to be seen. After all, South Africa is still quite traditional in how it approaches data management. So, while telecoms operators understand what data lakes are, their willingness to use them effectively is still up for debate.

    Data ownership

    Complicating matters is that data analysts are organised and prefer to work in a structured environment, whereas data lakes require a more scientific approach based on curiosity. The mindset needed to really do the proverbial wading into a data lake is quite different to how many organisations and analysts currently view data.

    It does not help that many organisations are still up in the air about the benefits of big data. There is still an ongoing debate on the merits of structured versus unstructured data and what the practicalities are for the daily running of a business. Unless a company has a dedicated group of people delving into big data (or even a data lake for that matter), there is not enough leeway to really get the benefits associated with it.

    Swimming in the lake

    A data lake does offer companies a powerful platform to do a lot of things with data, but it does require a leap of faith in how to access it. If you do not know what you are looking for in the data lake, you are never going to find it. Companies have limited budgets and, in a tough economy, they want to remain focused. For smaller businesses, data lakes are simply not a viable option, even though they are built on the same unstructured data these businesses are leveraging.

    An option would be to divide the lake into smaller data ponds where each kind of data is pooled together. This means scientists or analysts can go to the right pond looking for specific data. So, even though the data is still a complete mess, at least it will be of the right kind.

    Whether the telecoms operators are willing to experiment with this and play around with data lakes will drive a lot of the growth potential for the immediate future. But the focus must be on getting the basics right and creating a platform from there.

  • The big data exploration: part three

    This article was first published on ITWeb, on 9th May 2018

    Source: ITWeb

    https://www.itweb.co.za/content/JBwEr7n5ANEv6Db2