
15 Must-Watch Data Analytics TED Talks


This article is part of a series. To discover more inspiring TED Talks covering a variety of industries and technologies, be sure to check out our roundups for Coding, Cybersecurity, or FinTech.

Today, data forms the foundation of our stories, and data analysts are the storytellers. Consider that by 2025, according to the World Economic Forum , the world will generate 463 exabytes of data per day. An exabyte is 1 quintillion bytes, or a 1 with 18 zeroes behind it.

We are swimming and swirling in a sea of data, but how do we make sense of it all? That’s what data analysts help us do. Data analysts collect and collate a variety of data points, turning them into stories that can produce actionable results. They impact every government, regulatory, commercial, and educational organization and help leaders make better decisions.

Listening to data analysts discuss their jobs, their dreams, and their concerns offers a unique window into how we can better use data in our lives and careers. A good forum for this is TED, a nonprofit organization that enlists experts around the world to share stories and insights. TED Talks are thought-provoking and short (usually under 15 minutes each).

To learn more about data and data analysis, check out these 15 must-watch TED Talks.

1. A Data Analyst’s Journey

Anna Leach, a Ph.D. student at the University of Arizona, says data analysis is something everyone should embrace — even if they don’t study it. Data analysis covers more than inputting numbers into an Excel spreadsheet and collecting the results. It requires investing time with people and the process of analyzing data, asking questions, and understanding that data can have biases.

“Data analytics is as much an art as it is a science,” Leach says. “Anyone with any background can be curious and investigate information to tell a complete story.”

2. Demystifying Data Science

Data science is a big, scary field that most people don’t understand — or at least, that’s the perception. But Asitang Mishra, a scientific application scientist at the NASA Jet Propulsion Laboratory, seeks to explain this fast-growing, often opaque field in easily relatable terms. For example, explaining his work to an Uber driver can be a challenge, and in this TED Talk, Mishra establishes a thoughtful baseline for data science and data analytics.

“We convert a human problem into a computer problem,” he says. “We solve human problems using principles of mathematics. But it’s important we communicate our ideas in a language that is not necessarily mathematics, that is more human.”

3. We’re All Data Analysts

Don’t believe we’re all data analysts? Listen to Rebecca Nugent, who explains the concept with some relatable, funny examples. Nugent, a professor of statistics and data science at Carnegie Mellon University, illustrates how we are all data analysts by performing simple daily activities like crossing the street.

Nugent also untangles the curious way in which some people are embarrassed to be called “illiterate” but proudly identify as “innumerate.” For example, she presents a 1992 Barbie doll ad that says, “Math class is tough,” then posits, “Can you imagine Mattel creating a Barbie that giggled as she said, ‘I can’t read’?”

4. Why Everyone Should Be Data Literate

You’ve probably heard the phrase, “data is the new oil,” but what does it mean? Jordan Morrow, who delivers frequent talks advocating for data literacy, explains: “Data is this valuable asset, but just like oil, it has to go through people and refinement to get value.”

In this TED Talk, Morrow covers how data literacy is important by using two everyday examples: deciphering truth in social media and buying a refrigerator. And, while not everyone is a data scientist, he thinks everyone should be comfortable with data.

5. Data and Your Health

For your next doctor’s visit, come prepared with data. That’s the recommendation from Talithia Williams, Ph.D. and Associate Professor of Mathematics at Harvey Mudd College. Williams supports this recommendation with a story about her pregnancy, during which she used data she compiled to challenge a suggestion from her doctor.

“By taking ownership of your data … just by taking these daily measurements about yourself, you become the expert on your body,” Williams says. “You become the authority. It’s not hard to do.”

6. What to Do With All This Data?

Counting things is easy. Understanding what they mean after they’re counted is far more difficult. Susan Etlinger, an industry analyst, urges us as consumers, coders, and analysts to think critically about the data we receive and try to interpret.

Data goes beyond numbers — it includes images, text, audio, and video. All the disparate types of data we have created require people to think more contextually about what it means and how we use it. “We are not passive consumers of data and technology,” she says. “We shape the role it plays in our lives and the way we make meaning from it. But to do that, we have to pay as much attention to how we think as how we code.”

7. Photographing the World Through Data

Though a bit dated (2014 is a data lifetime ago), Dan Berkenstock’s TED Talk is a fascinating combination of data and passion. Once a data scientist who chased nuclear tech smugglers, Berkenstock moved into satellite imagery, building satellites that cost a fraction of what their predecessors had.

Berkenstock explains the uses beyond Google Mapping your own house. “We see ourselves as pioneers of a new frontier, and beyond economic data, unlocking the human story, moment by moment,” he says. “For a data scientist that just happened to go to space camp as a kid, it just doesn’t get much better than that.”

8. Skills You Need When Working With Data

Collecting data, inputting it into Excel spreadsheets, and writing algorithms aren’t the only skills required of data analysts and scientists. Jose Miguel Cansado argues that those who work with data should also have an artistic mindset.

Cansado, VP of sales at an intelligence company, says data prevents crime, predicts political revolutions, and creates fine art. Data has enhanced our lives, but we still need to know how to use it properly. “It’s a paradigm shift,” Cansado says. “Where we made decisions based on intuition and guesswork, now we can manage based on evidence and we can move based on data-driven decisions.”

9. Influencing Decisions that Matter

Prukalpa Sankar tells the story of data-based decision-making through the lens of two 2014 events: that year, Germany leveraged data analytics to win the World Cup football tournament, and Myanmar had to halt its census because it ran out of pencils.

“It’s crazy that big data is used to solve some kinds of problems and not others,” says Sankar, who has founded two companies that seek to democratize data. Sankar envisions a world in which data can predict traffic patterns or determine if and when children might drop out of school before they even know it.

10. How Should We Use Data to Make Decisions?

Christina Orphanidou assesses the ways people should use data to make decisions. Individually, we make many decisions largely on impulse; our moods and biases inform those decisions.

But Orphanidou, Senior Manager of the Data and Artificial Intelligence Lab at PricewaterhouseCoopers (PwC), says people are beginning to process decisions based on data, just like large corporations. “We will move away from impulse and intuition and toward decisions based on data and evidence,” she says. “Our partners in making decisions will be intelligent machines.”

11. Small Farms and Big Data

Erin Baumgartner worked at MIT for 11 years before starting a food delivery service called Family Dinner. Baumgartner used her experience in data analytics to build menus, cut food waste, and help small farmers be more competitive.

Baumgartner says the food industry is “broken” and she’s out to change it by using analytics to create a community around local food. “I believe that the story of local food needs to be understood, told, and elevated. And in many ways, I think that nerds like us are really uniquely poised to tell it.”

12. Tech Companies, Data, and Our Children

Did you know that tech companies know about your children before they’re born? By conducting a web search for “ways to get pregnant”, downloading tracking apps, or posting ultrasound photos, parents give companies all sorts of data about their unborn children. And that’s just the beginning.

Veronica Barassi, author of the book Child Data Citizen, explains to parents why this data about their children matters. “All of these technologies transform the baby’s most intimate behavioral and health data into profit by sharing it with others,” she says.

13. The Human Insights Missing From Big Data

In the 2000s, a global digital-products manufacturer watched business sharply decline because of a decision based on in-house analytics. This company’s data said people were not interested in buying smartphones.

Tricia Wang, a data ethnographer, says that the company wasn’t analyzing the right data. Wang proceeds to delineate the difference between “big data” and “thick data,” which contains human stories, emotions, and interactions that can’t be quantified as easily. She also outlines the importance of combining these data forms into a complete model.

14. What Data Analytics Can Teach Successful Organizations About Success

Is it possible to outthink the competition and not outspend them? Rasmus Ankersen makes his argument through the lens of European football, suggesting that many organizations can benefit.

Ankersen, an author and entrepreneur who lends his analytics expertise to football teams in England and Denmark, further explores how a sports gambler became so successful that he bought two European teams — and ran those teams using the analytic models that made him a successful gambler.

15. Crime and Data

As attorney general of New Jersey, Anne Milgram employed data analytics that helped lower crime in Camden. Later, she worked at a foundation that built a data analytics tool that enables judges to assess the risk of jailing or freeing people who have been arrested.

Milgram says that data is among the most important forces in public safety. She also references how a Major League Baseball team used data analytics to transform itself, which was detailed in the book Moneyball.

“I wanted to introduce data and analytics and rigorous statistical analysis into our work,” says Milgram. “In short, I wanted to Moneyball criminal justice. It worked for the Oakland A’s, and it worked for the state of New Jersey.”


A weathervane for a changing world: refreshing our data and analytics strategy − speech by James Benford

Good morning. It is a pleasure to be here to speak to you today.

I will talk today on the steps underway at the Bank of England, between our data and technology areas, to update our data and analytics strategy.

It’s important for two reasons.

First, the way we collect and use data has implications far beyond the organisation itself.

The Bank sets the main interest rate in the economy to meet our inflation target of 2%. We produce and circulate banknotes and oversee the smooth running of payment systems. We regulate UK banks, building societies, insurers and large investment firms to make sure they are being run in a safe and sound way. We monitor risk across the UK financial system as a whole, taking action to mitigate it where needed. We can support the financial system by lending to it when and where needed. Where firms run into difficulties, we ensure this doesn’t cause problems for protected depositors, for taxpayers or the wider economy.

Across all these areas, data and analysis drive policy decisions the Bank makes and how we communicate them.

Further, the data we collect, and how we collect it, affect over 2,000 financial institutions that report to us and contribute to over 37,500 published data series which are core to UK and global financial statistics.

Second, I hope that sharing our experience will be of use to others. Plotting a course on data for complex, long-standing organisations is not straightforward, particularly during periods of rapid technological change.

The Bank and its history with data

I will start by saying a bit about the Bank’s history with data.

That context explains where we are today and shapes the priorities for the next stage.

Data are the lifeblood of the Bank and analytics the beating heart behind our decisions. We cannot take effective decisions to discharge our functions without detailed information and expert analysis about the economy and financial system. It has always been thus, right back to when the Bank was founded as a private bank in 1694.

There are many historical examples, but my favourite goes back to 1805 when a wind dial was put up in the Court room in the Bank, effectively our boardroom, linked to a vane on the roof. Back then, the weather was an important regular influence on the economy and that dial drove the Bank’s decisions on monetary policy. When the wind came in from the east, ships would be sailing up the Thames to unload their goods. The Bank would then put more money in circulation, so traders could purchase goods as they were unloaded. If the wind came from the west, the Bank would pull back excess money, to stop too much money chasing fewer goods, tempering inflation.

Over two hundred years later, we talk of real-time dashboards driving decisions in the board room. The technology is different and there are a lot more data, but the intent is identical.

Production of statistics by the Bank can be traced back to 1851, when the Cashier’s office was collating things like the average price of wheat and metals; gold and silver bullion holdings and exchanges; and, of course, interest rates. These were brought together in a book of ‘Periodical Fluctuations’, first compiled in 1875. By 1921, the first department at the Bank dealing with economic and financial statistics was formed, producing estimates for the economy as a whole, including for the balance of payments. In 1960, we began to publish the Bank of England Quarterly Bulletin, with an economic commentary, articles and a statistical annex. Monetary and financial statistics moved online in 2000.

Collections to support how we regulate firms in the financial sector have a shorter, though more complex, history. The Bank’s supervisory powers came with the Banking Act in 1979 in the aftermath of the secondary banking crisis. Collections later took on an international imperative following the first Basel Accord in 1991. The third iteration of the Basel regulations, following the 2008 global financial crisis, gave us, initially under a European purview, the system of some 200 regular regulatory returns we have today. The move of supervision from the Bank to the Financial Services Authority in 1997, and its return to the Bank in 2013, means there is a complex reporting system in the UK where some collections come to the Bank from the Financial Conduct Authority, and some come direct. The UK’s departure from the European Union has given us a valuable opportunity to review the approach and rationalise what is collected.

Data in use in the Bank today

Given the theme of this conference, I thought I would say a bit about some fairly unique big data sets the Bank has access to from the data we collect and the data we generate from our own operations. Around 40 are managed through an on-premise data and analytics platform which offers additional computing power to crunch the numbers. I’ll give you four examples.

The largest big dataset arrives from trade repositories, which each day send us around 100 million rows of data on individual derivatives transactions and positions and around 15 million rows of data on securities financing transactions and positions. Our financial stability area has built up a range of tools to interrogate this dataset every day and to monitor and flag the build-up of risk in the system.

Second, we also have a large household-level dataset encompassing quarterly reporting at the loan level on the 9.5 million mortgages in the UK [1], which has been vital to tracking and understanding the impact of recent increases in Bank Rate.

Third, to support our analysis of the health of UK companies, we draw on a large company-level dataset covering the balance sheet and profit and loss for 1,500 firms in the UK [2]. It was used to understand pressures on balance sheets through the pandemic, with those insights going not just to our Monetary and Financial Policy Committees but also to HM Treasury to help them design and run various corporate loan schemes. We also run a monthly Decision Maker Panel survey of more than 2,000 firms [3] up and down the country.

Fourth, all electronic payments in the UK run through the real-time gross settlement system, or RTGS, which the Bank runs. RTGS processes over £775bn [4] of payments each working day, providing the Bank with a unique window on the economy. As well as putting it to use internally with the MPC, we made cuts of the data available to the ONS during the pandemic to help them track what was happening to spending in the economy.
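To make concrete what interrogating a dataset like the trade repository feed might look like day to day, here is a minimal sketch in Python with pandas. The file name, the column names (counterparty, asset_class, notional_gbp) and the three-standard-deviation flagging rule are all illustrative assumptions for exposition, not the actual reporting schema or the Bank's monitoring methodology.

```python
import pandas as pd

# Illustrative only: the file, columns and flagging rule are assumptions,
# not the real trade repository schema or the Bank's risk tooling.
trades = pd.read_csv("derivatives_transactions.csv", parse_dates=["trade_date"])

# Daily gross notional by counterparty and asset class.
daily = (trades
         .groupby(["counterparty", "asset_class",
                   pd.Grouper(key="trade_date", freq="D")])["notional_gbp"]
         .sum()
         .reset_index())

# Trailing statistics for each counterparty/asset class pair.
stats = (daily.groupby(["counterparty", "asset_class"])["notional_gbp"]
              .agg(["mean", "std"])
              .reset_index())

# Flag today's exposures that sit well above their historical average,
# as a crude stand-in for "monitor and flag the build-up of risk".
latest = daily[daily["trade_date"] == daily["trade_date"].max()]
flagged = latest.merge(stats, on=["counterparty", "asset_class"])
flagged = flagged[flagged["notional_gbp"] > flagged["mean"] + 3 * flagged["std"]]

print(flagged[["counterparty", "asset_class", "notional_gbp"]])
```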

Updating the Bank’s Data and Analytics Strategy

Building a data platform powerful enough to assemble and interrogate those big datasets was one cornerstone of the Bank’s previous data and analytics strategy. The strategy focused on enablement, established a broader data community across the Bank and equipped our analysts with modern analytical tools. One in four people at the Bank is now a member of our data community and one in five is cutting code regularly in R or Python to carry out analytical work. Six in ten are interrogating data through one of our new interactive dashboards. Progress embedding these tools across the business has been accelerated by a central analytics enablement hub, which has now partnered with teams across the business on over 30 projects to modernise regular analytical processes.

While our previous strategy succeeded in driving the adoption of a modern set of tools and techniques in some parts of the Bank, progress was uneven and hampered by a number of barriers. The Bank’s Court of Directors commissioned our Independent Evaluation Office (IEO) to take stock of the Bank’s approach and make a set of recommendations on where to go next. That review started last year and initiated the process to refresh and update our approach. Let me walk you through the seven steps underway.

Step 1: Independent Review

Having an independent unit reviewing what we do at the Bank is very powerful; it helps increase public trust in us and improve our openness, and these evaluations have led to positive changes in how we work at the Bank.

The IEO’s over-arching question was ‘Is decision-making to support the Bank’s policy objectives informed by the best available data, analysis and intelligence, and can it expect to be so in the future?’

Its conclusions were released in October 2023. It recognised both the big strides the Bank had made in recent years and the distance still to travel. There had been a nervousness in adopting new technologies, notably cloud solutions, and in changing established ways of working. There was also a tendency towards siloed ways of working and localised solutions.

The report set out ten detailed recommendations under three key themes. The first called for the Bank to agree a clear, updated vision for data and analytics, supported by a comprehensive strategy and governance. The second recommended breaking down institutional, cultural and technological barriers. The third encouraged us to broaden and deepen our efforts to equip staff with the necessary tools and skills to work effectively with data.

The fact that there is plenty still to do is no surprise. The Bank is a complex institution with a long history where almost all areas work with data and analytics in some way. Its functions have evolved organically over time, reflecting the changing nature of its responsibilities. All strategies need themselves to evolve, reflecting the changing challenges as well as the opportunities that come with new technologies.

Step 2: Define a governance structure to shape a collective response

Responding to the recommendations of a review is itself an opportunity to form and embed collective ownership.

Before we got started revising the strategy, our second step was to strengthen the Bank’s governance arrangements and form, at Executive level, a new Data and Analytics Board, co-chaired by the Chief Data Officer (myself) and the Chief Information Officer, Nathan Monk. The Board reports to Court and has responsibility for agreeing, maintaining and taking forward the Bank’s Data and Analytics strategy. It has on it a Director representing each area of the Bank, as well as key areas in the centre like People and Change.

It brings together a federated system of local data boards. Each agrees priorities for its area of the business, places its own call on the overall strategy, and stands up projects to meet local needs. Creating that system has catalysed all parts of the Bank to be clear and specific about their priorities for improving the way they use data to support their business objectives and is helping tighten up responsibilities around critical analytical processes and the datasets that feed them.

We have set up a D&A Business Partners function in the centre to work with corresponding functions in technology and in people to glue the pieces together. They provide a Bank-wide service, identifying opportunities to collaboratively solve data and analytics issues that span multiple business areas and facilitating their more efficient resolution. An initial focus of the partners has been helping all local areas to identify their priorities and the initiatives to take them forward.

Following a recommendation in the review, we also set up an expert Technology and Data Advisory Panel to provide continual advice and challenge on the Bank’s approach, including by keeping us in step with the very latest technologies.

Step 3: Define medium term strategic goals

In the third step, we drew together from across the business their aims for driving decisions with data and then worked up priority areas of change.

In the Bank’s policy areas, the common aim is to put interactive dashboards in the hands of those on our policy Committees or the staff who present to them. These dashboards allow colleagues to interrogate data and the outputs of models much more easily. Doing so can cut through the iterative and often time-consuming process of decision-makers putting questions on the data to analytical teams, awaiting a written report and commissioning another with follow-ups.

The Bank’s markets and banking area is looking both to use real-time data to inform live operational decisions and to harvest data from operations to shine a light on the economy and financial system.

We are looking to broaden our use of management information to inform our corporate policies, building on the success of a scorecard to track progress on diversity outcomes.

On the Bank’s data collections, we are seeking to get the right data in the building, at the right quality, at the lowest achievable cost, including to reporters.

The big areas we are seeking to change across the organisation as a whole are grouped into four missions:

  • Stronger data governance and management, to make it easier for staff to find, access and connect the data they need
  • Work with UK and international organisations to share data and drive adoption of data standards and best practices
  • A new cloud platform to modernise how we analyse data and inform decisions
  • Applications of innovative technologies, such as artificial intelligence, that build on it

Underpinning this is a set of ‘foundations’, focused heavily on people, process and technology.

For each of these missions and foundations, we agreed a set of specific change goals with a three-to-five-year horizon in mind.

Step 4: Agree principles and a consistent data architecture to guide approach 

In the fourth step, to guide a common approach, drawing on the National Data Strategy and the approach of organisations like the Office for National Statistics, we agreed a set of D&A Principles to set the tone at all levels of the organisation:

  • We take a Bank-wide approach: we build from shared systems, use common data and collaborate on analysis
  • That doesn’t mean one-size fits all. The second principle is to start with business outcomes and use data to support them, including by enabling experts in the business to build the tools they require
  • We manage our data consistently, securely, transparently and ethically, promoting trust, extensive sharing and safe innovation

We have begun to refresh various corporate policies and frameworks to bring these principles to life and ensure they have teeth.

Specific standards on data management and analytical processes are now incorporated within the Bank’s Code. These provide staff with comprehensive guidance on data management and analytical processes and are informed by industry best practice. As well as tightening up local processes, the new policies are helping to populate a central register of core analytical processes and underpinning datasets, supporting their discoverability.

An important and immediately impactful step has been rapid work in Technology to draw up a target data architecture to meet the goals in the strategy and embody the principles so there is compliance with them by design. At its heart is a new Cloud strategy and an ambition to manage the Bank’s data on the cloud unless there is a strong reason not to. The components of the architecture have now been agreed. It is now the North Star for every live project and programme embodying technological change in the Bank, ensuring that each works towards the data strategy.

Our aim is that in time all of our data will be held in one lake or connected to it, described in a single, searchable catalogue, and connected with an integrated suite of analytical tools.
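As a toy illustration of what "described in a single, searchable catalogue" can mean in practice, the sketch below models a catalogue as a small registry that analysts can search by name, description or tag. It is a simplification for exposition only, using invented dataset names; it does not represent the Bank's chosen cloud tooling or catalogue product.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CatalogueEntry:
    """Metadata describing one dataset held in, or connected to, the lake."""
    name: str
    owner: str
    description: str
    tags: List[str] = field(default_factory=list)

class DataCatalogue:
    """A toy, in-memory stand-in for an enterprise data catalogue."""

    def __init__(self) -> None:
        self._entries: List[CatalogueEntry] = []

    def register(self, entry: CatalogueEntry) -> None:
        self._entries.append(entry)

    def search(self, term: str) -> List[CatalogueEntry]:
        term = term.lower()
        return [e for e in self._entries
                if term in e.name.lower()
                or term in e.description.lower()
                or any(term in t.lower() for t in e.tags)]

catalogue = DataCatalogue()
catalogue.register(CatalogueEntry(
    name="mortgage_performance",          # invented name, for illustration
    owner="Data and Statistics Division",
    description="Loan-level quarterly reporting on UK mortgages",
    tags=["household", "loan-level"],
))
print(catalogue.search("mortgage"))
```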

Work started between Data and Technology before Christmas to explore various proofs of concept, testing our thinking on connecting our data and analytics platform to the cloud. We are now moving on to develop a minimum viable product for a cloud platform, creating an environment where pilots can be stood up against prioritised use cases in the business.

Step 5: Agree immediate priorities for a data portfolio

Step 5, marshalled by our Change and Planning function, was to prioritise the investment portfolio for the year ahead. This year we took a different approach on the investment portfolio by bringing all the experts and decision makers together at an offsite to agree prioritisation principles and effect them. In the data part of the portfolio, we agreed three priority business-facing programmes for the year ahead:

  • First, and most advanced is the completion of work to move the Bank’s systems for producing, storing and disseminating financial statistics to the Cloud, unlocking productivity gains and new analytical capabilities
  • Second, to renew the system that underpins the Bank’s management and analysis of macroeconomic and financial market data and forecasting to enhance the support provided to the Bank’s decisions on monetary policy.
  • Third, to transform the Bank’s approach to regulatory data collections, through a phased approach to delivery that aligns with the planned ‘Banking Data Review’ of regulatory reporting.

These three business-facing programmes will build on a Bank-wide Data & Analytics (D&A) Modernisation programme, jointly between Data and Technology, tasked with effecting the agreed data strategy, including providing the common governance and cloud platform for others to build from.

Getting the right mix of a prioritised set of business-facing initiatives, or ‘verticals’, and a capability-focused foundation, or ‘horizontal’, is critical to success. An approach focused solely on capability risks building a platform to nowhere. Having only business-facing initiatives could risk an incoherent whole – the creation of silos on the cloud. Having both verticals and a horizontal provides a route to targeting the capability build at important and urgent business needs, whilst maintaining coherence across the organisation.

As well as serving the three large business-facing initiatives, the D&A modernisation programme is managing a controlled set of pilots of AI tools across the Bank. These are targeted against an initial set of tightly defined use cases in different areas of the business, overseen by an AI Taskforce led jointly by Data and Technology. We are choosing these pilots to hone an AI portfolio that has good coverage across different areas of the Bank and to help us explore all areas of recent advances in AI. We are testing both off-the-shelf ‘Copilot’ tools and more bespoke applications.

The programme is also driving a revised approach on skills, jointly with our People area. Data will be an early pilot of an approach to establish a formal Professions model in the Bank, similar to that used across the civil service. We are using that to define different types of data roles, from scientists, to architects and engineers, to develop learning pathways and sharpen the career proposition. We have recently refreshed our data apprenticeships with the aim of increasing numbers and have embedded a week of learning on data and analytics within the graduate programme. We are going beyond a menu of technical courses to a broader data literacy proposition for all roles.

Step 6: Seize the opportunity of change to rework end-to-end operating models

Change presents a golden opportunity to step back, and our sixth step is to re-shape ways of working. We are taking a user-centric approach, identifying pain points and missed opportunities in current processes, and applying service-design principles to fundamentally re-organise our processes and set clear requirements for the new systems we are building. The nature of the pain points varies and solutions need to be tailored to those circumstances. Viewed end-to-end, the process feeding an ultimate decision can span many different areas of an organisation and can often extend beyond it. Modern technology, from process automation through to artificial intelligence, can deliver important efficiency gains, enhance analytical capabilities and unlock entirely new possibilities.

In our work to transform statistical production and macro-financial analysis, we are looking afresh at the business processes involved, seeking efficiency gains and enhanced capabilities.

Most ambitious here are our plans to Transform Data Collection, jointly with the Financial Conduct Authority (FCA). Reporting to the Bank and the FCA is a complex and large activity. In 2019, the annual cost of reporting obligations for UK banks was estimated at between £2 and £4½ bn [5] – an indicator both of the scale of the challenge and the size of the potential prize. The aim is to build a system that provides the right data at the lowest possible cost. Though our primary focus, given the imperative of the Banking Data Review, is on data collated from banks, the ambition is to build a new approach and framework that can be applied more broadly to other collections at the Bank. We will publish an update to industry on our joint plans with the FCA for the coming year by the end of the month.

Step 7: Publish and mobilise to execute the plan, to track and manage the risks and the benefits

Our seventh and final step is to publish our plan and mobilise to effect it. We committed, as part of the management response to last year’s review, to publishing a three-year roadmap with the Bank’s annual report in June. That commitment is already proving a valuable device to focus attention at a senior level on agreeing the plan, including prompting business areas to think through what they may need in the years ahead.

Execution requires different skillsets to strategy and design. We are currently mobilising the resources required between our platform engineering teams in technology and partner teams in data, and looking also at how we partner externally. We are making sure we have the right structures to manage dependencies, commonalities and sequencing across the data portfolio, and systems to manage the risks and to track the benefits.

An ongoing commitment to report on progress to Court, our Board, will maintain focus on the value created and efficiency gains made, and will be an important device for managing risks to execution, including those that come through broader dependencies.

There you have them. Our next seven steps towards data heaven.

We are bringing all our data together and connecting it, modernising our analytical processes, and upskilling our workforce so all can take advantage of the very latest tools. We are taking the opportunity to look at how we connect externally and to re-work our business processes.

Transformation won’t happen overnight, but we will keep at it, reporting regularly on progress and sharing our learnings so you can feed back and hold us to account.

Thank you for listening. 

Acknowledgements

I would like to thank Jasbir Lally and Pooja Prem for helping me to prepare these remarks and Kat Harrington, Dorothy Fouracre, Beth Hall, Will Parry, Susie Philp, Rajveer Berar, Scott Brind, Phoebe Pryor-Hilliard, Rebekah O’Toole and Noor Rassam for feeding in various facts. Work to respond to the IEO review and refresh our data strategy was spearheaded by a great leadership team in the Data and Analytics Transformation Directorate (DAT), including Martine Clark who heads up Data and Statistics Division and brought together our new principles, Peter Eckley who heads up Data Strategy, Paul Robinson who heads up Advanced Analytics and co-chairs our AI taskforce, and David Learmonth who ran the strategy refresh for us. Nathan Monk, William Lovell (the second co-chair of the AI taskforce), Iro Lyra and Rahul Pal in Technology drove work at pace to design our new data architecture and draw up a new cloud strategy. Jo Hill and Rebecca Braidwood in change and planning led the reshape of the Bank’s investment portfolio and our approach to prioritising and managing it. Jane Cathrall and Natasha Oakley in People are working on a new talent offer at the Bank, in which Mohini Subhedar in DAT is piloting data as a profession and a data literacy framework for all roles. Thank you also to Andrew Bailey, Chris Duffy, Huw Pill, Rhys Phillips, Fiona Shaikh for comments.

Footnotes

[1] PSD007 Mortgage Performance Data, 2023 H1.

[2] PRA-regulated firms: Which firms does the PRA regulate? | Bank of England.

[3] Decision Maker Panel Data, January 2024: Monthly Decision Maker Panel data – January 2024 | Bank of England.

[4] RTGS and CHAPS Annual Report: Real-Time Gross Settlement (RTGS) system and CHAPS Annual Report 2022/23 | Bank of England.

[5] See the Future of Finance Report (2019).


James Benford

Executive Director for Data and Analytics Transformation and Chief Data Officer


Present Your Data Like a Pro

  • Joel Schwartzberg


Demystify the numbers. Your audience will thank you.

While a good presentation has data, data alone doesn’t guarantee a good presentation. It’s all about how that data is presented. The quickest way to confuse your audience is by sharing too many details at once. The only data points you should share are those that significantly support your point — and ideally, one point per chart.

To avoid the debacle of sheepishly translating hard-to-see numbers and labels, rehearse your presentation with colleagues sitting as far away as the actual audience would. While you’ve been working with the same chart for weeks or months, your audience will be exposed to it for mere seconds. Give them the best chance of comprehending your data by using simple, clear, and complete language to identify X and Y axes, pie pieces, bars, and other diagrammatic elements. Try to avoid abbreviations that aren’t obvious, and don’t assume labeled components on one slide will be remembered on subsequent slides.

Every valuable chart or pie graph has an “Aha!” zone — a number or range of data that reveals something crucial to your point. Make sure you visually highlight the “Aha!” zone, reinforcing the moment by explaining it to your audience.
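To show what that advice can look like in code, here is a minimal matplotlib sketch: one point per chart, plainly labelled axes, and a visually highlighted "Aha!" zone. The quarterly figures, labels and file name are invented purely for illustration.

```python
import matplotlib.pyplot as plt

# Invented data, purely to illustrate the labelling advice above.
quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [1.2, 1.3, 1.4, 2.1]  # $ millions

fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(quarters, revenue, color="steelblue")

# One point per chart: the Q4 jump. Highlight the "Aha!" zone explicitly.
ax.axvspan(2.5, 3.5, color="gold", alpha=0.3)
ax.annotate("Q4 jump after launch",
            xy=(3, 2.1), xytext=(0.8, 1.9),
            arrowprops=dict(arrowstyle="->"))

# Simple, clear, complete labels so the back row can still read the chart.
ax.set_xlabel("Quarter (fiscal year 2023)")
ax.set_ylabel("Revenue ($ millions)")
ax.set_title("Revenue accelerated in Q4")

plt.tight_layout()
plt.savefig("revenue_q4.png", dpi=150)
```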

With so many ways to spin and distort information these days, a presentation needs to do more than simply share great ideas — it needs to support those ideas with credible data. That’s true whether you’re an executive pitching new business clients, a vendor selling her services, or a CEO making a case for change.


  • Joel Schwartzberg oversees executive communications for a major national nonprofit, is a professional presentation coach, and is the author of Get to the Point! Sharpen Your Message and Make Your Words Matter and The Language of Leadership: How to Engage and Inspire Your Team. You can find him on LinkedIn and on X at TheJoelTruth.



The Best TED Talks on Big Data Analytics


Janet Williams

  • April 5, 2017

What is a TED Talk?

The growth of Big Data has been highly inspiring. Over the years, data analytics, extraction, and data visualization have revolutionized numerous sectors across the globe. Irrespective of their nature, operations, and size, business organizations are embracing and leveraging data like never before!

This increasing popularity of big data and data analytics creates unique opportunities for further research and development. Tech-world thought leaders, aspiring entrepreneurs, data analysts, thinkers, and data scientists have come together to unlock new avenues for big data, especially in TED Talks.

These TED Talks on big data are comprehensive accounts of their thoughts and present a clear picture of what the world is thinking about the data revolution. The acronym ‘TED’ stands for ‘Technology, Entertainment, and Design.’ TED aims to promote innovative ideas and discuss path-breaking issues in talks of 18 minutes or less.

TED reaches out to countless people across the globe through its official website, TED.com, and a highly popular YouTube channel.

The official website is flooded with a whopping 556 talks and discussions on data analytics. What’s more interesting is that readers can also add their feedback and suggestions.

Let’s take a look at the top ten TED Talks on data analytics.

1. The Birth of a Word, by Deb Roy

As the CEO and co-founder of Bluefin Labs, Deb Roy researches the underlying theories of human cognition and machine learning. He also directs technical projects at the ‘Cognitive Machines Group’ and is currently engaged in learning how kids acquire language and in designing communicative machines.

2. The Beauty of Data Visualization, by David McCandless

McCandless is one of the most popular data journalists and loves to draw visually appealing conclusions from complicated data sets. He believes in attractive designs and their power to reduce information silos.

3. What we Learned from 5 Million Books, by Erez Lieberman Aiden and Jean-Baptiste Michel

As the Founding Director of the ‘Harvard Cultural Observatory,’ Jean-Baptiste Michel, along with his friend and colleague Erez Lieberman Aiden, studies human history, culture, and language. He carries out this research with the help of quantitative methods, thus unlocking new opportunities for data analytics.

4. Big Data, Better Data, by Kenneth Cukier

Kenneth Cukier works as the Data Editor for ‘The Economist,’ and he has also co-authored the book ‘Big Data: A Revolution That Will Transform How We Live, Work, and Think.’ This man is no less than an encyclopedia when it comes to discussing big data, human cognition, and machine learning.

5. What to do with Big Data, by Susan Etlinger

Susan Etlinger works with the Altimeter Group as an industry analyst. With a keen focus on data analytics, she identifies current market trends and new opportunities. As the author of ‘A Framework for Social Analytics’ and ‘The Social Media ROI Cookbook,’ Susan offers readers crucial insights into the significance and use of data.

6. How Data can Revolutionize the Business Arena, by Philip Evans

Apart from co-authoring ‘Blown to Bits,’ Philip Evans happens to be the managing director and senior partner at the ‘Boston Consulting Group.’ Philip did a lot of research on the role of data scraping and analytics in the business world.

7. Smart Statistics Help you Fight Crime, by Anne Milgram

As a seasoned criminal prosecutor in the US, Anne Milgram shared her thoughts on the role played by data analytics in criminal prosecution. While serving as the Attorney General of New Jersey, Anne came across some of the most crucial insights. She realized that the entire prosecution process relied on individual insights and instinct, which often led to wrong perceptions.

8. Storytelling and the Birth of Mean Data, by Ben Wellington

From an ace computer scientist to a phenomenal data analyst to a storyteller and blogger, Ben Wellington plays diverse roles. He leverages NYC’s open data, weaving stories about numerous aspects of the city.

9. Relationship Analytics & Business, by Zack Johnson

‘Syndio Social’ is the creative brainchild of Zack Johnson. His company helps business organizations across the world to embrace and adopt change. Zack took up social network analysis for his higher studies and worked on crucial projects for NASA, the US Army, MacArthur Foundation, and more.

10. Human-Computer Cooperation, by Shyam Sankar

Shyam Sankar is one of the most important personalities on this list. He’s the director of ‘Palantir Technologies’ – a data analysis firm. Sankar’s company helps business organizations and law enforcement units analyze crucial data, helping them stay ahead of the market competition.

Leveraging the Power of Data

All said and done, data analytics has an important role to play in the present and future world. With tech innovations giving birth to revolutions, data analytics is turning into a necessity for businesses.

Parting Thoughts

These TED Talks aim to capture the growth and development of data. Thought leaders come together to break down the key findings associated with big data analytics.

The world is a place where huge amounts of open data are waiting to be analyzed. With effective suggestions from TED analysts, business organizations, independent workers, and creative thinkers will have better opportunities to perform and excel.


The Ultimate Guide to Speech Analytics: 5 Power Steps to Understanding Voice Data

Ravi Gandhi

A comprehensive guide to harnessing speech analytics for insightful voice data analysis


Introduction

In today’s digital age, voice has become one of the most pivotal mediums for communication. From customer support to business meetings, we’re constantly speaking, and behind those words lies a goldmine of data. Speech analytics is the tool that helps us unearth this treasure, making sense of spoken words to derive actionable insights. This guide will walk you through the intricacies of speech analytics, why it’s a game-changer, and how businesses can harness its power.

What is Speech Analytics?

The process of analysing recorded calls or real-time conversations to gather insights is what speech analytics is all about. It’s like having a magnifying glass for spoken words, where you can spot trends, sentiments, and even future predictions.

– The Technology Behind It: Using advanced algorithms and artificial intelligence, speech analytics tools dissect voice data, converting it into text, and then analysing it for patterns.

– Applications in Business: From customer service to sales, businesses across the board use speech analytics to enhance performance metrics and improve overall customer satisfaction.
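To ground the "converting it into text, and then analysing it for patterns" step, here is a minimal Python sketch. The transcribe() function is a placeholder for whatever speech-to-text service you plug in (it is not a real library call), and the keyword lists are invented for illustration; production speech analytics tools use far richer language and sentiment models.

```python
from collections import Counter
import re

def transcribe(audio_path: str) -> str:
    """Placeholder: call your speech-to-text provider of choice here."""
    raise NotImplementedError

# Invented keyword lists; real sentiment models are far more sophisticated.
NEGATIVE = {"cancel", "refund", "complaint", "frustrated", "unacceptable"}
POSITIVE = {"thanks", "great", "helpful", "resolved", "perfect"}

def analyse_call(transcript: str) -> dict:
    """Very rough pattern analysis over a single call transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(words)
    return {
        "word_count": len(words),
        "negative_hits": sum(counts[w] for w in NEGATIVE),
        "positive_hits": sum(counts[w] for w in POSITIVE),
        "top_terms": counts.most_common(5),
    }

if __name__ == "__main__":
    sample = "Thanks for your help, the refund was processed and the issue is resolved."
    print(analyse_call(sample))
```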

Benefits of Speech Analytics

Understanding voice data isn’t just about numbers; it’s about extracting real value from conversations. The perks of speech analytics are numerous, and here are some that stand out:

– Enhancing Customer Experience : By pinpointing areas of discontent or patterns in customer behaviour, companies can tailor experiences to suit their audience’s needs.

– Operational Efficiency: Identifying best practices, training opportunities, and operational bottlenecks becomes a breeze with speech analytics.

– Risk Management: Detect compliance issues or potential risks before they escalate into bigger problems.

Challenges in Implementing Speech Analytics

No technology comes without its set of challenges. While speech analytics is a powerhouse of insights, it also brings along its fair share of hurdles:

– Data Privacy Concerns : As conversations are recorded, businesses need to be wary of data protection regulations and ensure customer privacy is maintained.

– Interpreting Emotions: While the technology is advanced, deciphering the true sentiment behind words can sometimes be a complex task.

– Infrastructure Requirements: Setting up the tools and systems for effective speech analytics may require significant investment.

Future of Speech Analytics

We’re just scratching the surface when it comes to harnessing the full potential of speech analytics. The future holds promising developments:

– Integration with Other Data Sources: Imagine combining voice data with other data points like purchase history or browsing habits. The insights would be unmatched.

– Real-time Analytics: Instead of post-call analysis, real-time feedback could change the way customer support or sales calls are handled.

– Emotion Recognition: Advanced algorithms might soon be able to pick up not just words, but also the emotions conveyed through tone and inflection.

Best Practices in Speech Analytics

If you’re considering diving into the world of speech analytics, here are some golden rules to keep in mind:

– Always Inform Participants: Before recording, ensure all parties are aware. It’s not just a best practice, but often a legal requirement.

– Combine with Other Metrics: Don’t rely solely on voice data. Combine it with other metrics for a holistic view.

– Continuous Training: As with any technology, continuous training and updates are essential to get the most out of speech analytics.

Speech Analytics Vs. Traditional Analytics

The world of analytics is vast. While traditional analytics relies on quantifiable metrics like sales numbers or website traffic, speech analytics delves deeper:

– Quantitative Vs. Qualitative: While traditional analytics gives you hard numbers, speech analytics provides insights into sentiments, emotions, and behaviour.

– Reactive Vs. Proactive: Traditional analytics is often reactive, while speech analytics allows businesses to be proactive, identifying issues or opportunities in real-time.

How to Choose a Speech Analytics Tool

The market is flooded with tools, but how do you pick the right one for your needs?

– Define Your Goals: Know what you want to achieve – be it improving customer service, increasing sales, or ensuring compliance.

– Check Integration Capabilities: Ensure the tool can seamlessly integrate with your existing systems.

– Prioritise User-Friendliness: A tool is only as good as its usability. Choose one that’s user-friendly and doesn’t have a steep learning curve.

Frequently Asked Questions

  • What are the main components of speech analytics?  

At its core, speech analytics involves capturing voice data, converting it to text, analysing the content, and deriving insights based on patterns, sentiments, and behaviours.

  • Why is speech analytics important for businesses?  

It provides unparalleled insights into customer behaviour, identifies areas for improvement, ensures compliance, and can even predict future trends.

  • Can speech analytics detect emotions?  

While it can give indications based on words used, true emotion detection, considering tone and inflection, is still in its developmental stages.

  • Is there a risk of breaching privacy with speech analytics?  

Yes, businesses need to ensure they’re compliant with data protection regulations and always inform participants before recording.

  • How accurate is speech analytics?  

While advancements in AI have significantly improved accuracy, the efficiency often depends on the tool used and the clarity of the recorded conversations.

  • Does speech analytics work in real-time?  

While many tools offer post-call analysis, there’s a growing trend towards real-time analytics, offering immediate feedback during calls.

In the expansive realm of analytics, speech analytics stands out as a beacon of qualitative insights. As businesses continue to evolve in the digital age, understanding voice data will become indispensable. Whether you’re a seasoned business leader or just starting, tapping into the potential of speech analytics could be the key to unlocking unprecedented growth.




10 Revealing TED Talks on Big Data and Analytics

  • 1. ‘The Birth of a Word’: Deb Roy (19:52)
  • 2. ‘The Beauty of Data Visualization’: David McCandless (18:17)
  • 3. ‘What We Learned from 5 Million Books’: Erez Lieberman Aiden and Jean-Baptiste Michel (14:08)
  • 4. ‘Big Data, Better Data’: Kenneth Cukier (15:55)
  • 5. ‘What to Do with Big Data’: Susan Etlinger (12:27)
  • 6. ‘How Data Can Revolutionize the Business Arena’: Philip Evans (13:57)
  • 7. ‘Smart Statistics Help You Fight Crime’: Anne Milgram (12:41)
  • 8. ‘The Era of Blind Faith in Big Data Must End’: Cathy O’Neil (13:18)
  • 9. ‘Relationship Analytics & Business’: Zack Johnson (14:46)
  • 10. ‘Human-Computer Cooperation’: Shyam Sankar (12:12)

Unlock the Potential of Big Data

QAT Global shares 10 TED Talks on big data and analytics. Dive deeper into the fascinating world of big data analytics by learning from these knowledgeable experts.

The rise of the computing age gave humans the ability to collect more information than they ever thought possible. Data is everywhere, but finding insights and plans of action from these massive amounts of information has created a unique challenge. Welcome to the fascinating world of big data analytics. Learn how the smartest business leaders, entrepreneurs, data analysts, and everyday people can benefit from harnessing the power of big data.

TED is a nonprofit devoted to spreading ideas, usually in the form of short, powerful talks (18 minutes or less). TED began in 1984 as a conference where Technology, Entertainment, and Design converged, and today covers almost all topics — from science to business to global issues — in more than 100 languages. Meanwhile, independently run TEDx events help share ideas in communities around the world. Let’s take a look at the top 10 TED Talks on the fascinating topic of Big Data Analytics!

1. ‘The Birth of a Word’: Deb Roy (19:52)

Deb Roy, an MIT researcher and the CEO and co-founder of Bluefin Labs, wanted to understand how human language is formed. His background in human cognition and machine learning prompted him to embark on a unique experiment with his own son. Over the course of three years, Roy compiled 90,000 hours of video and 140,000 hours of audio data that helped map out the origin of a word.

This innovative linguistic experiment has since expanded into the world of mass media. Its implications continue to show how big data-driven applications can reveal the way humans learn and interact on a global scale. https://youtu.be/RE4ce4mexrU

2. ‘The Beauty of Data Visualization’: David McCandless (18:17)

Data journalist David McCandless had a very important question to answer when it comes to complicated data: what does it all mean? He set out to transform complex data such as military spending, media hits, and Facebook status updates into easy-to-understand diagrams. He believes changing data into visual landscapes is the best way to navigate, analyze, and understand big data. Data visualization is the future of how we interpret and react to data patterns in our own lives. https://youtu.be/5Zg-C8AAIGg

3. ‘What We Learned from 5 Million Books’: Erez Lieberman Aiden and Jean-Baptiste Michel (14:08)

Jean-Baptiste Michel and Erez Lieberman Aiden of the Harvard Cultural Observatory set out to study human history, culture, and language with the help of big data analytics. Using Google Labs’ Ngram Viewer, they were able to make fascinating connections about the evolution of cultural trends. Their research, built on quantitative methods, displays the power that big data analytics has in tracing the development of human ideas. https://youtu.be/5l4cA8zSreQ

4. ‘Big Data, Better Data’: Kenneth Cukier (15:55)

As the data editor of The Economist, Kenneth Cukier is very familiar with the power of big data. He believes that the power of data allows us to ‘see different.’ What exactly does that mean? Cukier answers this question by showing the immense global impact that rides on the future of big data analytics and machine learning. Learn how humans can turn static data into something fluid and dynamic. https://youtu.be/8pHzROP1D-w

5. ‘What to Do with Big Data’: Susan Etlinger (12:27)

Susan Etlinger, an industry analyst with the Altimeter Group, loves big data and its use in market analytics. As we receive more and more data, Etlinger reminds us that it is important to think critically about it. As she states, “Facts are stupid things,” and we face the challenge of creating meaning out of data sets. In this talk, she challenges us to go beyond the data and truly understand it. https://youtu.be/AWPrOvzzqZk

6. ‘How Data Can Revolutionize the Business Arena’: Philip Evans (13:57)

Philip Evans, a senior partner at the Boston Consulting Group, explains how important big data analytics has become in the business world. In this discussion, he unites two key ideas in business strategy: Bruce Henderson’s ‘increasing returns to scale’ and Michael Porter’s ‘value chain.’ Evans explores the notion that big data is the next big step in the evolution of forming new competitive advantages in business. https://youtu.be/EHTmxmuhZ10

7. ‘Smart Statistics Help You Fight Crime’: Anne Milgram (12:41)

Anne Milgram, who became New Jersey’s attorney general in 2007, looks into the key role big data analytics could play in criminal justice. She wanted to understand who her office was arresting, who it was charging, and who it was putting in the nation’s jails. In her talk, she argues that crime cannot be fought on insight and instinct alone, and she reiterates the importance of using big data analytic techniques to solve and prevent crime. https://youtu.be/ZJNESMhIxQ0

8. Cathy O’Neil: The era of blind faith in big data must end (13:18)

Mathematician and data scientist Cathy O’Neil reveals the problem with relying too heavily on big data. In her talk, she asks, “What if the algorithms are wrong?” Departments and companies that lean too much on data algorithms risk creating unfair bias in the workplace. Learn why she calls these unfair algorithms “weapons of math destruction.” https://www.ted.com/talks/cathy_o_neil_the_era_of_blind_faith_in_big_data_must_end

9. ‘Relationship Analytics & Business’: Zack Johnson (14:46)

Zack Johnson is the co-founder of Syndio Social, a company that helps organizations adopt change. Adopting change can be the difference between an organization making billions of dollars and becoming obsolete. In his talk, Johnson highlights how social network analysis reveals better and more efficient practices for businesses. Learn more about the fascinating world of relationship analytics. https://youtu.be/MLaQotcqOxo

10. ‘Human-Computer Cooperation’: Shyam Sankar (12:12)

Shyam Sankar is a director at Palantir Technologies, a data analysis firm that helps businesses use data analytics to sustain a competitive advantage. In this discussion, he explains why relying on brute-force computing alone is not efficient. He proposes that algorithms and computing power need human cooperation to produce clearer, more accurate results. Learn more about how cooperation between humans and machines will deliver the best insights. https://youtu.be/ltelQ3iKybU

As these experts have shown, big data is all around us. To gain value and insight from big data analytics, organizations need the ability not just to process the vast quantities of data being generated, but also to blend the right datasets together to give them context and meaning. Big data goes beyond business intelligence and analytics: identifying which people or machines in your organization need the insight, and how to get it to them quickly and securely, is the difference between competitive advantage and increased operational costs. Your business is special, and so is your data. Let the analytics experts at QAT Global create real-time insights that will give your business the ultimate advantage over the competition.

The Role of Big Data, Machine Learning, and AI in Assessing Risks: A Regulatory Perspective

Scott W. Bauguess, Acting Director and Acting Chief Economist, DERA

Champagne Keynote Address: OpRisk North America 2017, New York, New York

June 21, 2017

Thank you, Alexander [Campbell], for the introduction.

Thanks also to Genevieve Furtado and the other conference organizers for the invitation to speak here today, at the 19th Annual Operational Risk North America Conference. I understand that this is the Champagne Keynote address. Given that title, I feel obligated as an economist to share with you the reported last words of John Maynard Keynes – the father of modern macroeconomics: “I should have drunk more champagne.” I hope my words here today do not inspire a similar sentiment. And finally, I must remind you that the views that I express today are my own and do not necessarily reflect the views of the Commission or its staff. [1]

My remarks this afternoon will center on a technology topic that is encroaching on many aspects of our lives and increasingly so within financial markets: Artificial Intelligence. Perhaps better known by its two-letter acronym “AI,” artificial intelligence has been the fodder of science fiction writing for decades. But the technology underlying AI research has recently found applications in the financial sector – in a movement that falls under the banner of “Fintech.” And the same underlying technology [machine learning and AI] is fueling the spinoff field of “Regtech,” to make compliance and regulatory-related activities easier, faster, and more efficient.  

This is the first time that I have addressed the emergence of AI in one of my talks. But I have spoken previously on the two core elements that are allowing the world to wonder about its future: big data and machine learning. [2] Like many of your institutions, the Commission has made recent and rapid advancements with analytic programs that harness the power of big data. They are driving our surveillance programs and allowing innovations in our market risk assessment initiatives. And the thoughts I’m about to share reflect my view on the promises – and also the limitations – of machine learning, big data, and AI in market regulation.

Perhaps a good place to begin is with a brief summary of where we were, at the Commission, 2 years ago. I remember well, because it was then that I was invited to give a talk at Columbia University on the role of machine learning at the SEC. I accepted the invitation with perhaps less forethought than I should have had. I say this because I soon found myself googling the definition of machine learning. And the answers that Google returned—and I say answers in plural, because there seem to be many ways to define it—became the first slide of that presentation. [3]  

The Science of Machine Learning and the Rise of Artificial Intelligence

Most definitions of machine learning begin with the premise that machines can somehow learn. And the central tenets of machine learning, and the artificial intelligence it implies, have been around for more than half a century. Perhaps the best-known early application came in 1959, when Arthur Samuel, an IBM scientist, published a solution to the game of checkers. For the first time, a computer could play checkers against a human and win. [4] This is now also possible with the board game “Go,” which has been around for 2,500 years and is purported to be more complicated and strategic than chess. Twenty years ago, it was widely believed that a computer could never defeat a human in a game of “Go.” This belief was shattered in 2016, when AlphaGo, a computer program, took down an 18-time world champion in a five-game match. [5] The score: 4 to 1.

Other recent advancements in the area of language translation are equally, if not more, impressive.  Today, if the best response to my question on the definition of machine learning is in Japanese, Google can translate the answer to English with an amazing degree of clarity and accuracy. Pull out your smart phone and try it. Translate machine learning into Japanese. Copy and paste the result into your browser search function. Copy and paste the lead paragraph of the first Japanese language result back into Google Translate. The English language translation will blow your mind. What would otherwise take a lifetime of learning to accomplish comes back in just a few seconds. 

The underlying science is both remarkable and beyond the scope of this talk. [6] (Not to mention my ability to fully explain it.) But it is not too difficult to understand that the recent advancements in machine learning are shaping how AI is evolving. Early AI attempts used computers to mimic human behavior through rules-based methods, which applied logic-based algorithms that tell a computer to “do this if you observe that.” Today, logic-based machine learning is being replaced with a data-up approach. And by data-up, I mean programming a computer to learn directly from the data it ingests. Using this approach, answers to problems are achieved through recognition of patterns and common associations in the data.  And they don’t rely on a programmer to understand why they exist.  Inference, a prerequisite to a rule, is not required.  Instead, tiny little voting machines, powered by neural networks, survey past quantifiable behaviors and compete on the best possible responses to new situations. 

If you want a tangible example of this, think no further than your most recent online shopping experience. Upon the purchase of party hats, your preferred retailer is likely to inform you that other shoppers also purchased birthday candles. Perhaps you need them too? Behind this recommendation is a computer algorithm that analyzes the historical purchasing patterns from you and other shoppers.  From this, it then predicts future purchasing-pair decisions. The algorithm doesn’t care why the associations exist. It doesn’t matter if the predictions don’t make intuitive sense. The algorithm just cares about the accuracy of the prediction. And the algorithm is continually updating the predictions as new data arrives and new associations emerge.
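To make that concrete, here is a minimal sketch of the kind of co-purchase counting such a recommendation might start from. The baskets, item names, and the simple pair-counting logic are all invented for illustration; a production recommender is far more elaborate.

    # A minimal, invented sketch of co-purchase counting: for each shopping
    # basket, count how often pairs of items appear together, then suggest
    # the items most often bought alongside a given product.
    from collections import Counter
    from itertools import combinations

    baskets = [
        {"party hats", "birthday candles", "balloons"},
        {"party hats", "birthday candles"},
        {"protein powder", "running shoes"},
        {"party hats", "balloons"},
    ]

    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1

    def bought_together(item, top_n=3):
        """Items most frequently purchased with `item` in past baskets."""
        scores = Counter()
        for (a, b), n in pair_counts.items():
            if a == item:
                scores[b] += n
            elif b == item:
                scores[a] += n
        return [other for other, _ in scores.most_common(top_n)]

    print(bought_together("party hats"))  # e.g. ['balloons', 'birthday candles']

Note that nothing in this counting cares why the associations exist, which is exactly the point of the data-up approach described above.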

This data-driven approach is far easier to apply and is proving in many cases to be more accurate than the previous logic-based approaches to machine learning. But how does it help a market regulator to know that purchasers of protein powder may also need running shoes?

The simple, and perhaps obvious, answer is that regulators can benefit from understanding the likely outcomes of investor behaviors. The harder truth is that applying machine learning methods is not always simple. Outcomes are often unobservable. Fraud, for example, is what social scientists call a latent variable. You don’t see it until it’s found. So, it is more challenging for machine learning algorithms to make accurate predictions of possible fraud than shopping decisions, where retailers have access to full transaction histories—that is, complete outcomes for each action. The same is true for translating languages; there is an extremely large corpus of language-pair translations for an algorithm to study and mimic.   

Two years ago, tackling these types of issues at the Commission was still on the horizon. But a lot of progress has been made since then, and machine learning is now integrated into several risk assessment programs—sometimes in ways we didn’t then envision. I’m about to share with you some of these experiences. But let me preview now, that while the human brain will continue to lose ground to machines, I don’t believe it will ever be decommissioned with respect to the regulation of our financial markets. 

The Rise of Machine Learning at the Commission

Let me start by giving you some background on staff’s initial foray into the fringes of machine learning, which began shortly after the onset of the financial crisis. That is when we first experimented with simple text analytic methods. This included the use of simple word counts and something called regular expressions, which is a way to machine-identify structured phrases in text-based documents. In one of our first tests, we examined corporate issuer filings to determine whether we could have foreseen some of the risks posed by the rise and use of credit default swaps [CDS] contracts leading up to the financial crisis. We did this by using text analytic methods to machine-measure the frequency with which these contracts were mentioned in filings by corporate issuers. We then examined the trends across time and across corporate issuers to learn whether any signal of impending risk emerged that could have been used as an early warning. 
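As a rough illustration of what such text analytics can involve (not the staff’s actual code or data), the following sketch uses a regular expression to count “credit default swap” mentions in a few invented filing excerpts and tallies them by year.

    # Invented illustration: count "credit default swap" mentions in filing
    # excerpts with a regular expression and tally them by filing year.
    import re
    from collections import Counter

    cds_pattern = re.compile(r"\bcredit[- ]default swaps?\b", re.IGNORECASE)

    filings = [
        (1998, "The company entered into credit default swaps to hedge its exposure."),
        (2004, "Credit default swap contracts are carried at fair value."),
        (2009, "Losses on credit-default swaps and related derivatives increased."),
    ]

    mentions_by_year = Counter()
    for year, text in filings:
        mentions_by_year[year] += len(cds_pattern.findall(text))

    for year in sorted(mentions_by_year):
        print(year, mentions_by_year[year])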

This was a rather crude proof-of-concept. And it didn’t work exactly as intended. But it did demonstrate that text analytic methods could be readily applied to SEC filings. Our analysis showed that the first mention of CDS contracts in a Form 10-K was by three banks in 1998.  By 2004, more than 100 corporate issuers had mentioned their use. But the big increase in CDS disclosures came in 2009. This was, of course, after the crisis was in full swing. And identification of those issues by the press wasn’t much earlier. We analyzed headlines, lead paragraphs, and the full text of articles in major news outlets over the years leading up to the financial crisis and found that robust discussions of CDS topics did not occur until 2008. During that year, we found a ten-fold increase in CDS articles relative to the prior year.   

Use of Natural Language Processing

Even if the rise in CDS disclosure trends had predated the crisis, we still would have needed to know to look for it. You can’t run an analysis on an emerging risk unless you know that it is emerging. So this limitation provided motivation for the next phase of our natural language processing efforts. This is when we began applying topic modeling methods, such as latent Dirichlet allocation, to registrant disclosures and other types of text documents. LDA, as the method is also known, measures the probability of words within documents and across documents in order to define the unique topics that they represent. [7] This is what the data scientist community calls “unsupervised learning.” You don’t have to know anything about the content of the documents. No subject matter expertise is needed. LDA extracts insights from the documents themselves, using the data-up approach to define common themes – these are the topics – and report on where, and to what extent, they appear in each document.
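For readers who want to see the mechanics, here is a minimal sketch of unsupervised topic modeling with LDA using scikit-learn. The toy documents and settings are invented; this is only an illustration of the technique, not the Commission’s pipeline.

    # Toy example of unsupervised topic modeling with latent Dirichlet
    # allocation, using scikit-learn. The documents are invented.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "revenue growth and strong operating performance this quarter",
        "litigation risk and a regulatory investigation were disclosed",
        "operating performance improved alongside revenue growth",
        "pending litigation and enforcement risk factors were disclosed",
    ]

    vectorizer = CountVectorizer(stop_words="english")
    word_counts = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(word_counts)  # per-document topic proportions

    terms = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top_terms = [terms[i] for i in weights.argsort()[::-1][:4]]
        print(f"topic {k}: {top_terms}")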

One of our early topic modeling experiments analyzed the information in the tips, complaints, and referrals (also referred to as TCRs) received by the SEC. The goal was to learn whether we could classify themes directly from the data itself and in a way that would enable more efficient triaging of TCRs. In another experiment, DERA – the Division of Economic and Risk Analysis – research staff examined whether machine learning could digitally identify abnormal disclosures by corporate issuers charged with wrongdoing. DERA research staff found that when firms were the subject of financial reporting-related enforcement actions, they made less use of an LDA-identified topic related to performance discussion. This result is consistent with issuers charged with misconduct playing down real risks and concerns in their financial disclosure. [8]   

These machine learning methods are now widely applied across the Commission. Topic modeling and other cluster analysis techniques are producing groups of “like” documents and disclosures that identify both common and outlier behaviors among market participants. These analyses can quickly and easily identify latent trends in large amounts of unstructured financial information, some of which may warrant further scrutiny by our enforcement or examination staff. 

Moreover, working with our enforcement and examination colleagues, DERA staff is able to leverage knowledge from these collaborations to train the machine learning algorithms. This is referred to as “supervised” machine learning. These algorithms incorporate human direction and judgment to help interpret machine learning outputs. For example, human findings from registrant examinations can be used to “train” an algorithm to understand what pattern, trend, or language in the underlying examination data may indicate possible fraud or misconduct. More broadly, we use unsupervised algorithms to detect patterns and anomalies in the data, using nothing but the data, and then use supervised learning algorithms that allow us to inject our knowledge into the process; that is, supervised learning “maps” the found patterns to specific, user-defined labels. From a fraud detection perspective, these successive algorithms can be applied to new data as it is generated, for example from new SEC filings. When new data arrives, the trained “machine” predicts the current likelihood of possible fraud on the basis of what it learned constituted possible fraud from past data. 

An Example of Machine Learning To Detect Potential Investment Adviser Misconduct

Let me give you a concrete example in the context of the investment adviser space. DERA staff currently ingests a large corpus of structured and unstructured data from regulatory filings of investment advisers into a Hadoop computational cluster. This is one of the big data computing environments we use at the Commission, which allows for the distributed processing of very large data files. Then DERA’s modeling staff takes over with a two-stage approach. In the first, they apply unsupervised learning algorithms to identify unique or outlier reporting behaviors. This includes both topic modeling and tonality analysis. Topic modeling lets the data define the themes of each filing. Tonality analysis gauges the negativity of a filing by counting the appearance of certain financial terms that have negative connotations. [9] The output from the first stage is then combined with past examination outcomes and fed into a second stage [machine learning] algorithm to predict the presence of idiosyncratic risks at each investment adviser.   
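A heavily simplified sketch of that two-stage idea appears below: topic proportions and a crude negative-term tonality count serve as features for a classifier trained on past examination outcomes. Every word list, number, and label here is invented; the actual DERA models are far more sophisticated.

    # Invented, two-stage toy: unsupervised features (topic proportions plus a
    # crude negative-tonality count) feed a supervised classifier trained on
    # past examination outcomes, which then scores a new filing.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    negative_terms = {"loss", "litigation", "violation", "deficiency"}

    def tonality(text):
        """Count negative-connotation terms in a filing (illustrative only)."""
        return sum(word in negative_terms for word in text.lower().split())

    # (filing text, topic proportions from an earlier unsupervised stage)
    filings = [
        ("revenue growth strong performance", [0.9, 0.1]),
        ("litigation loss and violation disclosed", [0.2, 0.8]),
        ("performance improved revenue growth", [0.8, 0.2]),
        ("deficiency violation litigation risk", [0.1, 0.9]),
    ]
    exam_outcomes = [0, 1, 0, 1]  # 1 = past exam found an issue (hypothetical)

    features = np.array([[tonality(text), *topics] for text, topics in filings])
    model = LogisticRegression().fit(features, exam_outcomes)

    new_text, new_topics = "litigation loss reported", [0.3, 0.7]
    new_features = np.array([[tonality(new_text), *new_topics]])
    print(model.predict_proba(new_features)[0, 1])  # estimated risk score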

The results are impressive. Back-testing analyses show that the algorithms are five times better than random at identifying language in investment adviser regulatory filings that could merit a referral to enforcement. But the results can also generate false positives or, more colloquially, false alarms.  In particular, identification of a heightened risk of misconduct or SEC rule violation often can be explained by non-nefarious actions and intent. Because we are aware of this possibility, expert staff knows to critically examine and evaluate the output of these models. But given the demonstrated ability of these machine learning algorithms to guide staff to high risk areas, they are becoming an increasingly important factor in the prioritization of examinations. This enables the deployment of limited resources to areas of the market that are most susceptible to possible violative conduct.

The Role of Big Data

It is important to note that all of these remarkable advancements in machine learning are made possible by, and otherwise depend on, the emergence of big data. The ability of a computer algorithm to generate useful solutions from the data relies on the existence of a lot of data. More data means more opportunity for a computer algorithm to find associations. And as more associations are found, the greater the accuracy of predictions. Just like with humans, the more experience a computer has, the better the results will be. 

This trial-and-error approach to computer learning requires an immense amount of computer processing power. It also requires specialized processing power, designed specifically to enhance the performance of machine learning algorithms. The SEC staff is currently using these computing environments and is also planning to scale them up to accommodate future applications that will be on a massive scale. For instance, market exchanges will begin reporting all of their transactions through the Consolidated Audit Trail system, also known as CAT, starting in November of this year. [10] Broker-dealers will follow with their orders and transactions over the subsequent 2 years. This will result in data about market transactions on an unprecedented scale. And, making use of this data will require the analytic methods we are currently developing to reduce the enormous datasets into usable patterns of results, all aimed to help regulators improve market monitoring and surveillance.

We already have some experience with processing big transaction data. Using, again, our big data technologies, such as Hadoop computational clusters that are both on premises and available through cloud services, we currently process massive datasets. One example is the Options Price Reporting Authority data, or OPRA data. To help you grasp the size of the OPRA dataset, one day’s worth of OPRA data is roughly two terabytes. To illustrate the size of just one terabyte, think of 250 million, double-sided, single-spaced, printed pages. Hence, in this one dataset, we currently process the equivalent of 500 million documents each and every day. And we reduce this information into more usable pieces of information, including market quality and pricing statistics.
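The page-equivalent figure is straightforward arithmetic, made explicit in the short sketch below.

    # The arithmetic behind that figure: about two terabytes of OPRA data per
    # day, at roughly 250 million printed pages per terabyte.
    terabytes_per_day = 2
    pages_per_terabyte = 250_000_000
    print(terabytes_per_day * pages_per_terabyte)  # 500,000,000 page-equivalents a day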

However, with respect to big data, it is important to note that good data is better than more data.  There are limits to what a clever machine learning algorithm can do with unstructured or poor-quality data. And there is no substitute for collecting information correctly at the outset. This is on the minds of many of our quant staff. And it marks a fundamental shift in the way the Commission has historically thought about the information it collects. For example, when I started at the Commission almost a decade ago, physical paper documents and filings dominated our securities reporting systems. Much of it came in by mail, and some [documents] still come to us in paper or unstructured format. But this is changing quickly, as we are continuing to modernize the collection and dissemination of timely, machine-readable, structured data to investors. [11]

The staff is also cognizant of the need to continually improve how we collect information from registrants and other market participants, whether it is information on security-based swaps, equity market transactions, corporate issuer financial disclosures, or investment company holdings. We consider many factors, such as the optimal reporting format, frequency of reporting, the most important data elements to include, and whether metadata should be collected by applying a taxonomy of definitions to the data. We consider these factors each and every time the staff makes a recommendation to the Commission for new rules, or amendments to existing rules, that require market participant or SEC-registrant reporting and disclosures. 

The Future of Artificial Intelligence at the Commission

So, where does this leave the Commission with respect to all of the buzz about artificial intelligence?

At this point in our risk assessment programs, the power of machine learning is clearly evident. We have utilized both machine learning and big data technologies to extract actionable insights from our massive datasets. But computers are not yet conducting compliance examinations on their own. Not even close. Machine learning algorithms may help our examiners by pointing them in the right direction in their identification of possible fraud or misconduct, but machine learning algorithms can’t then prepare a referral to enforcement. And algorithms certainly cannot bring an enforcement action. The likelihood of possible fraud or misconduct identified based on a machine learning prediction cannot – and should not – be the sole basis of an enforcement action. Corroborative evidence in the form of witness testimony or documentary evidence, for example, is still needed. Put more simply, human interaction is required at all stages of our risk assessment programs.

So while the major advances in machine learning have and will continue to improve our ability to monitor markets for possible misconduct, it is premature to think of AI as our next market regulator.  The science is not yet there. The most advanced machine learning technologies used today can mimic human behavior in unprecedented ways, but higher-level reasoning by machines remains an elusive hope. 

I don’t mean for these remarks to be in any way disparaging of the significant advancements computer science has brought to market assessment activities, which have historically been the domain of the social sciences. And this does not mean that the staff won’t continue to follow the groundbreaking efforts that are moving us closer to AI. To the contrary, I can see the evolving science of AI enabling us to develop systems capable of aggregating data, assessing whether certain Federal securities laws or regulations may have been violated, creating detailed reports with justifications supporting the identified market risk, and forwarding the report outlining that possible risk or possible violation to Enforcement or OCIE staff for further evaluation and corroboration.

It is not clear how long such a program will take to develop. But it will be sooner than I would have imagined 2 years ago. And regardless of when, I expect that human expertise and evaluations always will be required to make use of the information in the regulation of our capital markets. For it does not matter whether the technology detects possible fraud, or misconduct, or whether we train the machine to assess the effectiveness of our regulations – it is SEC staff who uses the results of the technologies to inform our enforcement, compliance, and regulatory framework. 

Thank you for your time today.

[1] The Securities and Exchange Commission, as a matter of policy, disclaims responsibility for any private publication or statement by any of its employees. The views expressed herein are those of the author and do not necessarily reflect the views of the Commission or of the author’s colleagues on the staff of the Commission. I would like to thank Vanessa Countryman, Marco Enriquez, Christina McGlosson-Wilson, and James Reese for their extraordinary help and comments. 

[2] SEC Speech, Has Big Data Made us Lazy?, Midwest Region Meeting of the American Accounting Association, October 2016. https://www.sec.gov/news/speech/bauguess-american-accounting-association-102116.html .

[3] http://cfe.columbia.edu/files/seasieor/center-financial-engineering/presentations/MachineLearningSECRiskAssessment030615public.pdf .

[4] Arthur Samuel, 1959, Some Studies in Machine Learning Using the Game of Checkers. IBM Journal 3, (3): 210-229.

[5] https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol.

[6] For an excellent layperson discussion on how machine learning is enabling all of this, see, e.g., Gideon Lewis-Kraus, The New York Times, December 14, 2016, The Great A.I. Awakening.

[7] See http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf .

[8] See, G. Hoberg and C. Lewis, 2017, Do Fraudulent Firms Produce Abnormal Disclosure? Journal of Corporate Finance, Vol. 43, pp. 58-85. 

[9] Loughran, Tim, and McDonald, Bill, 2011. When is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10-Ks. Journal of Finance 66: 35–65.

[10] See, e.g., https://www.sec.gov/divisions/marketreg/rule613-info.htm .

[11] Securities and Exchange Commission Strategic Plan Fiscal years 2014-2018, https://www.sec.gov/about/sec-strategic-plan-2014-2018.pdf .


A Big Data Approach to Public Speaking

Key takeaways from analyzing 100,000 presentations.

April 04, 2016


How do you better resonate with your audience? A Stanford GSB lecturer uses big data to explain what works. | Reuters/Jonathan Ernst

Students in my strategic communication class often ask how they can become more engaging, competent communicators. This question is in no way new — rhetoricians dating back to the ancient Greeks have explored this issue. However, unlike Cicero and Aristotle, we now have big data tools and machine-learning techniques to examine the core characteristics of effective communicators.

One person leveraging this technology (and one of my most popular guest lecturers) is Noah Zandan, founder and CEO of Quantified Communications, which offers one of the first analytics platforms to measure, evaluate, and improve corporate executives’ communication skills.

Zandan’s team of data scientists analyzed more than 100,000 presentations from corporate executives, politicians, and keynote speakers. They examined behaviors ranging from word choices and vocal cues to facial expressions and gesture frequency. They then used this data to rate and rank important communication variables such as persuasiveness, confidence, warmth, and clarity.

Zandan grounds his team’s work in a communication scheme created by psychologist Albert Mehrabian. They expand upon Mehrabian’s original “Three V’s” — the verbal, vocal, and visual choices that a communicator makes — by adding a fourth V: the vital elements of communication.

Here’s what his team has learned through studying the best communicators, combined with concepts I cover in class:

VERBAL: Language used in corporate earnings calls impacts up to 2.5% of stock price movement

The actual words you use, whether spoken or written, matter. Zandan and his team found that the language used in corporate earnings calls affects up to 2.5% of stock price movement. Based on data from the most successful communicators, here are three things to keep in mind.

First, word choice should be appropriate for your audience and conform to the context (e.g., formality). Relying on jargon is likely to confuse your audience. The best approach is always to take the time to define terms and technologies that some in your audience might not know. You would also be well-served to have someone review your content specifically to confirm that your word choices are appropriate.

Second, avoid hedging language. Qualifying phrases such as “kind of” and hesitant language like “I think” can be beneficial in interpersonal communication, where they invite contribution and adjust your status relative to the person with whom you are conversing. But in contexts like presenting in public, they can reduce your credibility. You will sound more confident when you remove qualifiers and say “I feel” or “I believe.” The best way to make yourself aware of how often you use hedging language is to have a trusted colleague alert you while giving a practice presentation. Once you’re aware, you will be better able to proactively eliminate this type of language.

Finally, speak clearly and concisely. Research suggests that succinct messages are more memorable. In fact, Zandan and his team found that effective communicators’ messages tend to be more concise than those from speakers who were rated as average or below average. Many presenters speak the way they write — that is, they use complex sentences with nested clauses and phrases. This works well in writing, but when you’re presenting, it’s hard for you to speak and challenging for your audience to follow. In writing, we don’t have to worry about pauses for breath. Nor do we need to worry about the audience understanding what we have written, as a reader can always reread a confusing passage. To be more concise, start by stripping away excess wording that might sound good when read silently but that adds limited value when spoken aloud. When you’re practicing, ask others to paraphrase your points to see if their wording can help you be more succinct.

VOCAL: Even just a 10% increase in vocal variety can have a highly significant impact

Vocal elements include volume, rate, and cadence. The keys to vocal elements are variation and fluency. Think of your voice like a wind instrument. You can make it louder, softer, faster, or slower. We are wired to pay attention to these kinds of vocal change, which is why it is so hard to listen to a monotonous speaker. In fact, even just a 10% increase in vocal variety can have a highly significant impact on your audience’s attention to and retention of your message.

Less expressive speakers should vary their volume and rate by infusing their presentations with emotive words like “excited,” “valuable,” and “challenging,” and using variations in their voice to match the meaning of these words. If you’re speaking about a big opportunity, then say “big” in a big way. With practice, you will feel more comfortable with this type of vocal variety.

Disfluencies — all those “ums” and “uhs” — might be the most difficult vocal element to address. Not all disfluencies are distracting. “Ums” and “uhs” within sentences are not perceived as frequently, nor are they as bothersome, as those that occur between thoughts and phrases. Your audience often skips over midsentence disfluencies because they are more focused on your content than your verbal delivery. But as you move from one point to another, disfluencies stand out because your audience is no longer focused on what you are saying. In essence, you are violating your audience’s expectation of a silent pause by filling it.

To address these between-thought disfluencies, be sure to end your sentences, and especially your major points, on an exhalation. By ending your phrases on a complete exhalation, you necessarily start your next thought with an inhalation. It is nearly impossible to say “um” (or anything, for that matter) while inhaling. A useful way to practice this is to read out loud and notice your breathing patterns. In addition to eliminating between-thought disfluencies, your inhalation brings a pause with it. This unfilled pause has the added benefit of varying your rate.

VISUAL: Educational researchers suggest about 83% of human learning occurs visually.

Visual elements refer to what you do with your body. Zandan cites studies by educational researchers that suggest approximately 83% of human learning occurs visually. Your nonverbal behaviors such as stance, gestures, and eye contact are critical not only for conveying and reinforcing your messages, but they serve as the foundation of your audience’s assessments of your confidence. This is important because your audience equates your competence with their perceptions of your confidence.

Your stance is all about being big and balanced. Stand or sit so that your hips and shoulders are square (i.e., not leaning to one side) and keep your head straight, not tilted. Presenting from a balanced position not only helps you appear more confident, but it actually helps you feel more confident, too. When you make yourself big and balanced, you release neurochemicals that blunt anxiety-producing hormones.


Gestures need to be broad and extended. When you’re gesturing, go beyond your shoulders rather than in front of your chest, which makes you look small and defensive. When you’re not gesturing, place your arms loosely at your sides or clasp your hands loosely right at your belly button level. Finally, remove any distracting items that you might futz or fiddle with, like jewelry, pens, and slide advancers.

Eye contact is all about connecting to your audience. In North American culture, audiences expect eye contact, and quickly feel ostracized when you fail to look out at them. While you need to spread your eye contact around so that you connect with your entire audience, you need not look at each member individually, especially if you are in front of a large crowd. A good strategy is to create quadrants and look in those various directions. Also, try to avoid repetitive patterns when you scan the room. Finally, as Zandan rightly advises his clients, if you are presenting remotely via video camera, imagine you’re speaking directly to people and look into the camera, not at your monitor or keyboard.

VITALS: Authentic speakers were considered to be 1.3 times more trustworthy and 1.3 times more persuasive.

Vital elements capture a speaker’s true nature — what some refer to as authenticity. Zandan’s team has found that the top 10% of authentic speakers were considered to be 1.3 times more trustworthy and 1.3 times more persuasive than the average communicator. Authenticity is made up of the passion and warmth that people convey when presenting.

Passion comes from exuding energy and enthusiasm. When you’re preparing and practicing your talk, be sure to reflect back on what excites you about your topic and how your audience will benefit. Reminding yourself of your motivation can help energize you (or reenergize you if it’s a presentation you give over and over again). Additionally, thinking about how you are helping your audience learn, grow, and achieve should ignite your spirits. This energy will manifest itself in how you relay your information. This doesn’t mean you have to be a cheerleader; you need to find a method for relaying your message that is authentic and meaningful for you.

Warmth can be thought of as operationalized empathy. It is a combination of understanding your audience’s needs and displaying that understanding through your words and actions. To be seen as warm, you should acknowledge your audience’s needs by verbally echoing them (e.g., “Like you, I once…”) and by telling stories that convey your understanding of their needs, such as the CEO who tells a story of the most difficult tech support call she had to deal with as she addresses her client services team. Further, maintain an engaged posture by leaning forward and moving toward people who ask questions.

Before your next speech, try out the Four V’s and the specific suggestions derived from big data and machine learning to see if they fit your needs. Only through reflection, practice, and openness to trying new things can you become an engaging, competent communicator.

Matt Abrahams is a Stanford GSB organizational behavior lecturer, author, and communications coach.


The hidden value of voice conversations: Part 1, Trends and technologies

With so many ways for customers to engage with companies, it may be surprising that voice conversations remain a popular choice. Recent technologies such as voice data analytics are allowing companies to use these personal interactions as a new driver of insight.

In this episode of McKinsey Talks Operations—the first in a series of two—host Daphne Luchtenberg is joined by Paul Humphrey, CEO and founder of Call Journey, a global thought leader in conversation intelligence and speech analytics, and Eric Buesing, a partner at McKinsey and a leader of McKinsey's customer care offering. Their conversation has been edited for clarity.

Daphne Luchtenberg: Your company's future success demands agile, flexible, and resilient operations. I'm your host Daphne Luchtenberg, and you're listening to McKinsey Talks Operations, a podcast where the world's C-suite leaders and McKinsey experts cut through the noise and uncover how to create a new operational reality.

Positive customer experiences are driven by high-quality, personalized interactions. And with the rise in bots and all things digital, it could be a fair assumption that businesses would be looking to slim down their call center operations, moving away from human voice interactions. Yet this has not happened, as many organizations are now taking a new look at the value that this personal interface can bring, less as a cost driver and more as an opportunity to provide strategic experience-oriented customer insight. Thanks to advances in technology, we’re now seeing a core analytics use case emerging for voice data analytics. But many businesses still struggle to capture and process the voice conversations they have with their customers in a way that drives real, measurable, bottom-line impact.

This is such a broad topic that we’re going to cover it over two episodes. The first episode will explore the wider trends related to voice interactions in customer care and the technologies that are available. The second episode will explore use cases and best practices for implementing the new analytics tools. I’m joined by Paul Humphrey, CEO and founder of Call Journey, a global thought leader in conversation intelligence and speech analytics, with more than 30 years of experience across multiple industries; and Eric Buesing, a partner at McKinsey and a leader of our Customer Care service line.

Eric and Paul, great to have you here. I’d like to open up our conversation today by exploring why, despite the growth of digital, voice remains a dominant channel.


Paul Humphrey: I think what's happening is that people are looking for more of that personal touch; they want to talk with real people. And they want to communicate and converse with people who feel and hear and empathize. Post-COVID, we're looking for more of an emotional connection. In fact, a 2021 Harvard study found that, if I remember it correctly, 36 percent of all Americans, including around 60 percent of young adults and about 50 percent of mothers with young children, felt serious loneliness (Milena Batanova, Virginia Lovison, Eric Torres, and Richard Weissbourd, Loneliness in America: How the pandemic has deepened an epidemic of loneliness and what we can do about it, Harvard Graduate School of Education and Making Caring Common Project, February 2021). That's an interesting one from an engagement perspective. Brand Keys, a New York–based brand loyalty and engagement research consultancy, runs annual behavioral assessments and emotional engagement metrics specifically around customer loyalty, and it delivers an index to identify the trends.

That index found that customer foundations for brand engagement, product and service purchase, and brand loyalty are now almost entirely emotionally based.

The 2022 index goes on to show cross-category decision-making ratios of about 80 percent emotional to 20 percent rational. The rational aspect for categories and sectors gets filled by customers under what the index owners call primacy of product and service, but brands will need to know the emotional-to-rational ratio for the values that drive consumer behavior in those categories. Digital interactions clearly drive a better bottom line for company performance, particularly cost to serve. But if you look deeper at customer journeys, overall engagement, sentiment, and emotional engagement with organizations and brands, the smarter organizations will know how to engage, who to engage, and where and how they engage those people, with ongoing customer interaction intelligence driving that evolution.


That’s all about emotional connection, which is what is lacking in the digital world and why we’re seeing that growth. So emotional connection is the new-age CX [customer experience] key.

Daphne Luchtenberg: That’s a nice way of putting it. And Eric, from your perspective, are you seeing that evenly distributed across all industries? Or are some industries leading the way?

Eric Buesing: It probably does vary by industry. I think there was a perception five or ten years ago that the traditional call center was becoming extinct. To your point, a lot of industries and organizations underestimated how the growth of digital would impact voice and/or maybe how quickly volumes might be replaced. I think instead, the importance of speaking with an informed representative of the organization, as Paul mentioned—that need for human and emotional connection in what we call the moments that really matter to the customer—has actually increased. So yes, the share of digital will probably grow as a percent of total interactions, but total volumes are going to continue to rise. And the complexity of those interactions is also increasing, which impacts handle times.

The question of what’s contributing to this trend—I think there might be a couple. It could be structural. For example, a poor digital interaction is always going to result in a call. Just think about the last time you tried to do something online and you got frustrated. The first thing you want to do is pick up the phone and potentially tell somebody off. So bad digital experiences drive live interactions—that’s the first thing. The other thing to recognize is that more people are transacting digitally and attempting to do more complex things. If more people are doing complex things, then that requires potentially a partial live interaction along with it.

Daphne Luchtenberg: Paul, anything to add there?

Paul Humphrey: No, I was going to mirror what Eric was saying. In fact, there are a couple of really good examples of financial-services organizations that we’ve been working with that are reporting exactly what Eric was mentioning. So if the ratio is staying fairly similar in terms of offline and online interactions, then the more digital, the more growth, the more exposure, the more things that people are exposed to, the more volume they’re going to continue to get by just having that natural growth of interactions. And as Eric was saying, it’s hard to digitize everything; it’s hard to make that easy interaction for customers via digital as simple as possible.


Daphne Luchtenberg: Yes, and that completely changes the purpose and the function of the contact center from delivering a basic service initially, which is what they were intended to do, now to a much more strategic generator of value and potentially also a powerful differentiator, I assume. And Eric, what have you seen organizations doing well? What are some of the steps to becoming a leader in this space?

Eric Buesing: Organizations that aspire for the servicing function of their contact center to really provide an exceptional experience—and by that, I mean they invest in ways to delight customers and really build loyalty versus organizations that view the channel as the last resort, a pure cost center—I think that's a differentiating factor. And even to push on that, organizations that recognize that the data and intelligence coming out of the contact center are incredibly valuable in other parts of the organization—I'd also characterize that as truly differentiating. For example, in financial services, we're seeing organizations putting real investment in capacity and rethinking the strategic value of what a contact center is. They're rethinking their talent, potentially hiring differently. They're introducing new knowledge systems and technology, all with the aim of increasing the capabilities and the productivity of employees.


Airlines are also a great example of linking customer data from contact centers and live interactions together to create a much more precise and even predictive view on the customer experience. In other words, they don’t need to wait for an NPS [net promoter score] survey to come back to know when a passenger has had a really bad day. They know their flight was delayed, and they missed a connection, and when they called, they waited for 45 minutes, and then their luggage got lost, and they missed a meeting. They can actually verify that through data. And if you can find that out earlier, you can be much more personalized or even faster in how you respond.

Daphne Luchtenberg: I love that. And that all comes back to listening and the listening skills around not just what are your customers saying but also how are they saying it. I know that a lot of us have seen advanced thinking around human speech and developments in neural-network language models, helping to overcome some of the legacy problems related to difficulties in extracting and using call center data. Paul, can you talk a little about what’s the latest and greatest now in terms of natural-language processing technology?

Paul Humphrey: I think, Daphne, it's best to first define NLP. Some people think it's not natural-language processing but another version of NLP, neuro-linguistic programming. But natural-language processing, if I give it a basic definition for context, is the art of utilizing computer science and AI to understand and analyze interactions between computers and human language. So NLP in our world is utilized to process and analyze huge amounts of conversation or conversation transcripts. It looks to understand context, to understand contextual nuances within the language and interactions, which then form actual conversations. NLP is getting toward what our data science folks might call computationally efficient. Historically, delivering NLP solutions has meant a ton of CPU and computing to make it work, which is super expensive and very rigid in what it can do.

Now the new kid on the block arrives, and that new kid is NLU, which is natural-language understanding. Whereas NLP uses algorithms that try to understand language and conversation, NLU looks to understand context. With that smarter technology and computing that we're talking about now, a combination of NLP and NLU is much quicker and is a better utilization of the good old term "fuzzy logic." So we've gone from NLP, which is kind of "look for these three phrases and see if I get a hit," to now skipping words that don't make sense, inferring context, and giving a holistic view of the nuances of a conversation—not just a couple of people talking at each other but a couple of people actually conversing with each other.
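
To make the distinction concrete, here is a rough, self-contained illustration (not Call Journey's technology): a phrase-spotting pass in the spirit of classic keyword-based NLP, next to a similarity-based pass that tolerates paraphrase, which is closer in spirit to what NLU aims for. The intents and the utterance are invented, and the TF-IDF similarity is only a crude lexical stand-in for the learned language models a production system would use.

```python
# Illustration only: phrase spotting vs. similarity-based intent matching.
# The intents and the utterance are invented examples; TF-IDF similarity is a
# crude lexical stand-in for the learned models a production NLU system uses.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

INTENT_EXAMPLES = {
    "cancel_account": "I want to cancel my account",
    "billing_dispute": "There is a charge on my bill I do not recognize",
}
SPOT_PHRASES = {
    "cancel_account": ["cancel my account"],
    "billing_dispute": ["charge on my bill"],
}

utterance = "Could you close down my account for me, please?"

# 1) Classic keyword/phrase spotting: exact phrases only, so paraphrases are missed.
spotted = [intent for intent, phrases in SPOT_PHRASES.items()
           if any(p in utterance.lower() for p in phrases)]
print("phrase spotting:", spotted or "no hit")

# 2) Similarity-based matching: scores closeness to each intent example, so
#    "close down my account" still lands nearest to the cancellation intent.
examples = list(INTENT_EXAMPLES.values())
vectorizer = TfidfVectorizer().fit(examples + [utterance])
scores = cosine_similarity(vectorizer.transform([utterance]),
                           vectorizer.transform(examples))[0]
for intent, score in zip(INTENT_EXAMPLES, scores):
    print(f"similarity to {intent}: {score:.2f}")
```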

For example, if we look at how NLP and NLU are being used at the moment in contact centers, which is a key topic of the discussion, there's a big demand for NLP and NLU for call summarization. There's a huge need for that in the world of contact centers. In fact, we did some work recently for a quite large bank, which found that 63 percent of its disposition codes were incorrect. A disposition code is where an agent summarizes what the call was actually about when they're wrapping up the call. This means that they got the reason for the call, why customers rang or why they contacted the bank, wrong 63 percent of the time. That meant that a huge amount of call-reason data being fed to the marketing folks of the bank was wildly inaccurate. And you can imagine the cost of those incorrect data insights to a big bank. That is massive, a huge impact.
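
As a hedged illustration of how such a disposition-code audit might be quantified (this is not the engagement described above), one can compare the agent-entered code with a model-predicted call reason for each transcript and report the disagreement rate. The predict_call_reason function and the sample records below are invented placeholders for a real call-reason classifier and real data.

```python
# Illustration: estimating how often agent disposition codes disagree with a
# model-predicted call reason. predict_call_reason() is a hypothetical stand-in
# for a real call-reason classifier; the sample records are invented.
from collections import Counter

def predict_call_reason(transcript):
    """Placeholder for a trained call-reason model."""
    text = transcript.lower()
    if "fee" in text or "charge" in text:
        return "billing_dispute"
    if "lost" in text and "card" in text:
        return "lost_card"
    return "general_enquiry"

calls = [
    {"transcript": "I was charged a fee twice this month", "agent_code": "general_enquiry"},
    {"transcript": "I lost my card yesterday", "agent_code": "lost_card"},
    {"transcript": "What are your opening hours?", "agent_code": "billing_dispute"},
]

mismatches = sum(predict_call_reason(c["transcript"]) != c["agent_code"] for c in calls)
print(f"disposition codes disagreeing with the model: {mismatches / len(calls):.0%}")
print("model call-reason mix:", Counter(predict_call_reason(c["transcript"]) for c in calls))
```

In practice the model side would itself be an NLP/NLU pipeline, and the audit would run over transcripts at scale rather than a handful of records.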

Best practice is for organizations to use effective NLP and NLU tools and combine them with machine learning, and the contact center is the best place to utilize that type of technology. And it's not just about making the contact center more efficient. The contact center holds massive amounts of rich conversations and interactions that happen with customers every day. In fact, if we look across the US, Australia, and the UK, it's about 16 billion minutes of contact center conversations every month. So there are huge opportunities being missed if you're not tapping into that properly.

The other point I think you were touching on earlier, Daphne—the difference between good and great systems in NLP and NLU—is the mix of NLP and NLU and combining that with machine learning. I would caution, though, against a siloed view of having that combination of tools, because you could have great tools, but if you engage a bad carpenter, the house you want to build won't be what it should be. So you need best-of-breed technology and best-of-breed implementation. For example, in our world, 67 percent of speech analytics solutions fail. Some of that's because of technology capability, and some of that's because of delivery capability, organizational capacity, and expertise. So you really do need to mix all of those and have the right expertise and thought leadership wrapped around what you're doing in that world of NLP, NLU, and machine learning.


Daphne Luchtenberg: Thanks, Paul. So basically, we need to think about NLP, we need to think about NLU, and then there’s the whole element of machine learning that brings this together in a framework that answers the right questions, and that has the folks in the team who know exactly how to analyze and then apply some of the insights that come from that. Eric, you talk to clients every day. Where have you seen clients adopt and embrace this new way of working to enhance what they’re doing?

Eric Buesing: I think we’re seeing new opportunities for impacts being dreamed up and unveiled at our clients constantly. The challenge and the problem that they’re solving are the same, and that’s that the amount of data available or becoming available is unwieldy and overwhelming. Organizations need tools like NLP, which has been around for a while, and NLU, which is really the understanding to help navigate that, not to replace the decision but actually to make a better human decision.

Where we are seeing it, it can be some even foundational basics done well. So sharpening visibility into classic contact center operations, like average handle time: What's actually happening when agents are speaking? If they're putting customers on hold, what's happening during that time? What triggered that? If I was able to resolve the issue in that interaction, which sometimes is referred to as first-contact resolution [FCR], why did that happen—not just the number of FCRs but the actual understanding of what led to resolving the issue the first time?
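
Metrics like these are typically computed straight from call records before any language model adds the "why." A minimal sketch, using invented records, of how average handle time and a simple first-contact-resolution rate might be tallied:

```python
# Illustration: tallying average handle time (AHT) and a simple first-contact-
# resolution (FCR) rate from call records. The records are invented examples.
from datetime import timedelta

calls = [
    {"customer": "A", "handle_seconds": 310, "resolved_first_contact": True},
    {"customer": "B", "handle_seconds": 540, "resolved_first_contact": False},
    {"customer": "C", "handle_seconds": 420, "resolved_first_contact": True},
]

aht = sum(c["handle_seconds"] for c in calls) / len(calls)
fcr = sum(c["resolved_first_contact"] for c in calls) / len(calls)
print(f"AHT: {timedelta(seconds=round(aht))}, FCR: {fcr:.0%}")
```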

As Paul mentioned, root cause analysis has been an idea that every organization has been trying to understand for a while, but one of the challenges is customers don’t always call in for one reason. Daphne, you might call in because you’re checking your balance, but by the way, there are three other things you wanted to do while you had that person on the phone, because good luck trying to get them on the phone again. So oftentimes there are multiple things. We call that multiple intent. NLU can uncover pairings. When customers call about one topic, they’re very likely to want to ask about another, which allows a more personalized or predictive interaction in the future.
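
A small sketch of the pairing idea Eric describes: given per-call intent labels, however they are produced, counting which intents co-occur in the same call surfaces the common pairings. The intent labels below are invented examples.

```python
# Illustration: counting which intents co-occur within the same call to surface
# common pairings ("multiple intent"). The intent labels are invented examples.
from collections import Counter
from itertools import combinations

calls = [
    {"balance_check", "update_address"},
    {"balance_check", "card_replacement"},
    {"balance_check", "update_address", "travel_notice"},
    {"card_replacement"},
]

pair_counts = Counter()
for intents in calls:
    pair_counts.update(combinations(sorted(intents), 2))

for (first, second), count in pair_counts.most_common(3):
    print(f"{first} + {second}: together in {count} call(s)")
```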

" "

The hidden value of voice conversations: Part 2, Reaping the rewards

That’s the basics, but there are other interesting things that NLU is unlocking on the agent side and from customer cues. So, for example, some organizations might ask themselves, “What are my best people saying? How do they navigate difficult interactions? How do they delight and surprise customers? And how’s that different from an average performer? Can I really get insight into that, so that I can train better, adjust how I upskill or how I retrain or how I coach people?” That’s new and unique.

In areas like compliance, NLP and NLU are helping organizations move from manual sampling, which is literally somebody listening in randomly—you hear that all the time: “This call might be recorded for quality assurance”—moving into AI or machine learning listening to every interaction. In regulated industries like medtech, a sales environment, or in financial services, it’s also providing more assurance that we’re doing right by the customer.

I would be remiss if I didn't mention that we're also looking at employees themselves. A big challenge right now is supporting the employees in an organization better as well. Sometimes we don't know whether people are happy or unhappy, whether they're having a good day or a bad day, or whether there are ways to create a better experience for them. Sometimes NLP and NLU also give insight into the mindset of the employee and help the organization, or their direct supervisors and managers, create better interactions and hopefully a better experience for everybody.

Daphne Luchtenberg: That’s really clear and so exciting in terms of the opportunities. It seems like we’ve only just started to touch on where this can take us.

Now that we’ve explored the technology and the insights it can give us, I’d like to wrap things up for this episode. I hope our audience will join us again for the next episode, where we will take some time to understand the implementation of these tools and how to capture the value. You’ve been listening to McKinsey Talks Operations with me, Daphne Luchtenberg. If you like what you’ve heard, subscribe to our show on Apple Podcasts, Spotify, or wherever you listen.

Paul Humphrey is CEO and founder of Call Journey. Eric Buesing is a partner in McKinsey’s Stamford office, and Daphne Luchtenberg is director of reach and engagement in the London office.

Comments and opinions expressed by interviewees are their own and do not represent or reflect the opinions, policies, or positions of McKinsey & Company or have its endorsement.


English language teaching based on big data analytics in augmentative and alternative communication system

  • Published: 08 February 2022
  • Volume 25, pages 409–420 (2022)


  • Ran Qian
  • Sudhakar Sengan
  • Sapna Juneja

653 Accesses

10 Citations


The tremendous growth in the education sector has given rise to several developments focused on teaching and training. The Augmentative and Alternative Communication (AAC) method has helped people with neurological disabilities learn for years, but AAC faces significant challenges that affect the level of language learning skills, mainly in English. Artificial intelligence can strengthen the AAC mechanism for improving English language levels because it can be trained on processed datasets; the system trains and tests on these datasets to produce adequate English communication output. In this paper, Big Data Integrated Artificial Intelligence for AAC (BDIAI-AAC) is proposed to train people with neural disorders in English. BDIAI-AAC couples speech recognition with a network of animated videos. The AI-trained network works on three layers of conversion: an input layer, a hidden layer, and an output layer. The input layer is the speech recognition model, which converts the educator's speech into a string. The hidden layer processes the string data and matches it against the corresponding video animation, verifying that the recognized string matches predefined dataset values for the animation; this hidden process comprises image processing, recurrent networks, and a memory unit for storing data. Finally, the output layer displays the animated video along with the sentence using AAC. Thus, English sentences are converted into their respective videos or animations using AI-trained networks and AAC models. A comparative analysis of the proposed BDIAI-AAC method against other technological advancements shows that it reaches a 98.01% word recognition rate and a 97.89% prediction rate, with high efficiency (95.34%), performance (96.45%), accuracy (95.14%), stimulus (94.2%), and disorder identification rate (91.12%) when compared with other methods.
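
The abstract describes a three-layer flow: an input layer that turns speech into text, a hidden layer that matches the text against a predefined animation dataset, and an output layer that plays the matched animation with its sentence. The sketch below is a schematic illustration of that flow only, not the authors' implementation; the recognize_speech helper and the animation catalog are hypothetical placeholders.

```python
# Schematic sketch of the three-layer flow described in the abstract
# (speech -> recognized text -> matched animation). Illustrative only; the
# recognize_speech() helper and the animation catalog are hypothetical.
from difflib import SequenceMatcher

ANIMATION_CATALOG = {              # hidden-layer "predefined dataset values"
    "hello how are you": "animations/greeting.mp4",
    "i want to eat": "animations/eating.mp4",
    "please help me": "animations/help.mp4",
}

def recognize_speech(audio_path):
    """Input layer: placeholder for any speech-to-text model."""
    return "hello how are you"     # a real system would transcribe audio_path

def match_sentence(sentence):
    """Hidden layer: match the recognized sentence to the closest entry
    in the predefined animation dataset."""
    return max(ANIMATION_CATALOG,
               key=lambda key: SequenceMatcher(None, sentence.lower(), key).ratio())

def present(audio_path):
    """Output layer: display the sentence together with its animation."""
    sentence = recognize_speech(audio_path)
    key = match_sentence(sentence)
    print(f"Recognized: {sentence!r} -> plays {ANIMATION_CATALOG[key]}")

present("educator_utterance.wav")
```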



Acknowledgements

The article is the research result of the project “The Study of Computer-assistant Translation in English Teaching” (No. 2019MS152), supported by the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

Ran Qian, Science and Technology College, North China Electric Power University, Hebei, Baoding, 071000, China

Sudhakar Sengan, Department of Computer Science and Engineering, PSN College of Engineering and Technology, Tamil Nadu, Tirunelveli, 627 152, India

Sapna Juneja, Department of CSE, IITM Group of Institutions, Sonipat, Haryana, India


Corresponding author

Correspondence to Ran Qian.


About this article

Qian, R., Sengan, S. & Juneja, S. English language teaching based on big data analytics in augmentative and alternative communication system. Int J Speech Technol 25, 409–420 (2022). https://doi.org/10.1007/s10772-022-09960-1


Received: 20 March 2021

Accepted: 31 December 2021

Published: 08 February 2022

Issue Date: June 2022

DOI: https://doi.org/10.1007/s10772-022-09960-1


  • Big data analytics
  • Artificial intelligence
  • Speech recognition model
  • Augmentative and alternative communication
  • Neurological disorder

TechRepublic


8 Best Data Science Tools and Software

Apache Spark and Hadoop, Microsoft Power BI, Jupyter Notebook and Alteryx are among the top data science tools for finding business insights. Compare their features, pros and cons.


EU’s AI Act: Europe’s New Rules for Artificial Intelligence

Europe's AI legislation, adopted March 13, attempts to strike a tricky balance between promoting innovation and protecting citizens' rights.


10 Best Predictive Analytics Tools and Software for 2024

Tableau, TIBCO Data Science, IBM and Sisense are among the best software for predictive analytics. Explore their features, pricing, pros and cons to find the best option for your organization.


Tableau Review: Features, Pricing, Pros and Cons

Tableau has three pricing tiers that cater to all kinds of data teams, with capabilities like accelerators and real-time analytics. And if Tableau doesn’t meet your needs, it has a few alternatives worth noting.


Top 6 Enterprise Data Storage Solutions for 2024

Amazon, IDrive, IBM, Google, NetApp and Wasabi offer some of the top enterprise data storage solutions. Explore their features and benefits, and find the right solution for your organization's needs.

Latest Articles


OpenAI, Anthropic Research Reveals More About How LLMs Affect Security and Bias

Anthropic opened a window into the ‘black box’ where ‘features’ steer a large language model’s output. OpenAI dug into the same concept two weeks later with a deep dive into sparse autoencoders.


Some Generative AI Company Employees Pen Letter Wanting ‘Right to Warn’ About Risks

Both the promise and the risk of "human-level" AI has always been part of OpenAI’s makeup. What should business leaders take away from this letter?


Cisco Talos: LilacSquid Threat Actor Targets Multiple Sectors Worldwide With PurpleInk Malware

Find out how the cyberespionage threat actor LilacSquid operates, and then learn how to protect your business from this security risk.


IBM’s Think 2024 News That Should Help Skills & Productivity Issues in Australia

TechRepublic interviewed IBM’s managing director for Australia about how announcements from the recent Think event could impact the tech industry in particular.


Cisco Live 2024: New Unified Observability Experience Packages Cisco & Splunk Insight Tools

The observability suite is the first major overhaul for Splunk products since the Cisco acquisition. Plus, Mistral AI makes a deal with Cisco’s incubator.


Top Tech Conferences & Events to Add to Your Calendar in 2024

A great way to stay current with the latest technology trends and innovations is by attending conferences. Read and bookmark our 2024 tech events guide.


Intel Lunar Lake NPU Brings 48 TOPS of AI Acceleration

Competition for AI speed heats up. Plus, the first of the two new Xeon 6 processors is now available, and Gaudi 3 deals have been cinched with manufacturers.


Cisco Live 2024: Cisco Unveils AI Deployment Solution With NVIDIA

A $1 billion commitment will send Cisco money to Cohere, Mistral AI and Scale AI.


The 5 Best Udemy Courses That Are Worth Taking in 2024

Udemy is an online platform for learning at your own pace. Boost your career with our picks for the best Udemy courses for learning tech skills online in 2024.


What Is Data Quality? Definition and Best Practices

Data quality refers to the degree to which data is accurate, complete, reliable and relevant for its intended use.


TechRepublic Premium Editorial Calendar: Policies, Checklists, Hiring Kits and Glossaries for Download

TechRepublic Premium content helps you solve your toughest IT issues and jump-start your career or next project.


What is the EU’s AI Office? New Body Formed to Oversee the Rollout of General Purpose Models and AI Act

The AI Office will be responsible for enforcing the rules of the AI Act, ensuring its implementation across Member States, funding AI and robotics innovation and more.


What is Data Science? Benefits, Techniques and Use Cases

Data science involves extracting valuable insights from complex datasets. While this process can be technically challenging and time-consuming, it can lead to better business decision-making.


Gartner’s 7 Predictions for the Future of Australian & Global Cloud Computing

An explosion in AI computing, a big shift in workloads to the cloud, and difficulties in gaining value from hybrid cloud strategies are among the trends Australian cloud professionals will see to 2028.


OpenAI Adds PwC as Its First Resale Partner for the ChatGPT Enterprise Tier

PwC employees have 100,000 ChatGPT Enterprise seats. Plus, OpenAI forms a new safety and security committee in their quest for more powerful AI, and seals media deals.



Voice analysis shows striking similarity between Scarlett Johansson and ChatGPT

Bobby Allyn

Scarlett Johansson arrives at the BAFTA Film Awards in London last year. Johansson says she was stunned when OpenAI unveiled a voice assistant that sounded eerily similar to her. New lab analysis suggests she has a point. (Angela Weiss/AFP via Getty Images)

Actress Scarlett Johansson’s voice bears a striking resemblance to OpenAI’s now-pulled “Sky” personal assistant, according to an artificial intelligence lab analysis conducted by researchers at Arizona State University.

At NPR's request, forensic voice experts at the university’s speech lab compared the famous actress’s voice and speech patterns to Sky using AI models developed to evaluate how similar voices are to each other.

The researchers measured Sky, based on audio from demos OpenAI delivered last week, against the voices of around 600 professional actresses. They found that Johansson's voice is more similar to Sky than 98% of the other actresses.
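
The lab has not published its code, but the kind of ranking described here, comparing a target voice with a reference set through learned speaker embeddings, can be sketched roughly as follows. The embed_voice function below is a stand-in for any pretrained speaker-embedding model (here it fabricates deterministic vectors so the ranking logic runs end to end), and the audio file names are invented.

```python
# Rough sketch of ranking voice similarity with speaker embeddings.
# embed_voice() fabricates deterministic vectors so the ranking logic runs
# end to end; a real study would use a pretrained speaker-embedding model.
# The audio file names are invented.
import numpy as np

def embed_voice(wav_path):
    # Placeholder: replace with a pretrained speaker-embedding model applied
    # to the recording at wav_path.
    rng = np.random.default_rng(sum(wav_path.encode()))
    return rng.standard_normal(256)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

target = embed_voice("sky_demo.wav")
references = {name: embed_voice(f"reference_voices/{name}.wav")
              for name in ("johansson", "hathaway", "russell")}

for score, name in sorted(((cosine(target, emb), name)
                           for name, emb in references.items()), reverse=True):
    print(f"{name}: similarity {score:+.3f}")
```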


Yet she wasn’t always the top hit in the multiple AI models that scanned the Sky voice.

The researchers found that Sky was also reminiscent of other Hollywood stars, including Anne Hathaway and Keri Russell. The analysis of Sky often rated Hathaway and Russell as being even more similar to the AI than Johansson.

The lab study shows that the voices of Sky and Johansson have undeniable commonalities – something many listeners believed, and that now can be supported by statistical evidence, according to Arizona State University computer scientist Visar Berisha, who led the voice analysis in the school’s College of Health Solutions and the College of Engineering.

“Our analysis shows that the two voices are similar but likely not identical,” Berisha said.

Berisha said while the study analyzed a vast array of subtle vocal features, it also zoomed in on several particular dimensions of each voice and teased out some differences.

The Sky voice has a slightly higher pitch than Johansson's; the Sky voice tends to be more expressive than Johansson's voice in the movie Her, and far more expressive than Johansson's normal speaking voice; and Johansson's voice is slightly more breathy than Sky's.
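
This is not the ASU lab's methodology, but as a rough illustration of how a single dimension such as pitch can be compared across two recordings, here is a minimal sketch using the librosa library (assuming it is installed); the audio file paths are invented.

```python
# Rough illustration: comparing the median fundamental frequency (pitch) of two
# recordings with librosa's pYIN tracker. Not the ASU lab's methodology; the
# audio file paths are invented.
import librosa
import numpy as np

def median_pitch_hz(path):
    y, sr = librosa.load(path, sr=16000)
    f0, voiced_flag, voiced_prob = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
    )
    return float(np.nanmedian(f0))  # NaN frames are unvoiced and ignored

for name in ("sky_demo.wav", "johansson_sample.wav"):
    print(f"{name}: median pitch ~{median_pitch_hz(name):.0f} Hz")
```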

But overall, the two voices have distinct parallels.

The researchers also analyzed the voices for likely “vocal tracts,” or what size throat, mouth and nasal passages would produce a particular sounding voice, and Sky and Johansson had identical tract lengths, they found.

OpenAI and a Johansson spokesman declined to comment on the lab results.

Deliberate design or coincidence?

In the week since Johansson accused OpenAI of copying her voice in its latest version of ChatGPT, a discussion has flared over the influential tech company’s intentions and whether the AI voice in question was the product of deliberate design or a coincidence.

As the debate raged, OpenAI paused its Sky voice, which was one of five possible voice options as part of the newest ChatGPT version.

OpenAI maintains that Sky was not created with Johansson in mind, saying it was never meant to mimic the famous actress.

“It's not her voice. It's not supposed to be. I'm sorry for the confusion. Clearly you think it is,” OpenAI CEO Sam Altman said at a conference this week. He said whether one voice is really similar to another will always be the subject of debate.

Johansson, for one, is outraged by the similarity. She said that Altman twice reached out to her attempting to license her voice for ChatGPT. She decided against it.

When the company's updated personal assistants were unveiled, she said she was shocked by how “eerily similar” the Sky voice sounded to her own.

Altman himself amped up the speculation on the day of the release by posting on X one word, “her,” the name of the 2013 romantic sci-fi film in which a lonely man falls in love with a superintelligent computer operating system voiced by Johansson.

OpenAI says it hired a voice actress to help develop the Sky voice months before Altman began reaching out to Johansson, which was first reported by the Washington Post.

OpenAI would not publicly identify the actress, citing personal privacy and potential risks to her safety.

Altman, last year, said Her, featuring Johansson, was his favorite movie about AI, complimenting the film for being “prophetic” about how conversational AI chatbots would one day function in people's lives.

Still unknown is whether Altman ever privately told the team that led the casting for the Sky voice that Johansson should be an inspiration.

OpenAI says Altman was not closely following the casting process. The company says he deferred to OpenAI Chief Technology Officer Mira Murati, who told NPR she didn't even know what Johansson sounded like until people were comparing Sky to the actress.

But Johansson said when Altman engaged her about potentially having her voice licensed, he pitched it as a way to make the cutting-edge conversational bot “comforting to people” who are perhaps wary of AI services that behave like humans — Altman’s last outreach came just two days before the public launch of the new personal assistant, Johansson said.

"After much consideration and for personal reasons, I declined the offer," she said.


Microsoft Graph Data Connect

A secure, scalable solution that enables you to copy relevant Microsoft 365 datasets into your Azure tenant to enable advanced analytics and insights.

Unlock key analytics scenarios with Microsoft 365 data


Customer relationship analytics

For commercial business leaders, go beyond traditional CRM insights and understand customer interactions and relationships based on communication and collaboration patterns.

Business process analytics

For better operations, see how work really flows through the organization on a day-to-day basis. Pinpoint the manual processes and workflow bottlenecks that should be automated or optimized.


Security and compliance analytics

To secure sensitive data, learn how employees are using and sharing sensitive information. Implement anomaly detection, threat intelligence, audit log analysis, risk management, and legal forensics.

People productivity analytics

For driving HR transformation, export your Viva productivity metrics so you can convert insights into solutions with digital adoption, smart meetings and content, hybrid workplaces, and cultural change.


Data Connect resources


Learn about policies and billing

Azure managed applications allow you to support certain Azure policies, giving customers greater confidence when using your apps.


Find answers to common questions

Read these tips to help you best use Microsoft Graph Data Connect.

Understand enterprise communication patterns with Data Connect

Microsoft and Neo4j introduce a security and compliance analytics scenario that uses Microsoft Graph Data Connect to access Microsoft 365 data (Outlook, Teams, SharePoint, and more) and Neo4j to uncover hidden communication patterns.


Get the latest Data Connect news

  • Custom encryption with customer-owned keys now generally available
  • SharePoint datasets now in public preview
  • Maximize employee productivity with HCL Nippon and Microsoft Graph Data Connect
  • Microsoft Graph Data Connect pricing updates

Learn how our customers and partners use Data Connect

Sura logo

Developing proprietary analytical models to offer businesses customized insurance policies that help mitigate risk.

neo4j logo

Introducing a security and compliance analytics solution with Microsoft to uncover the hidden patterns of communication in an organization using Outlook datasets.

JLL logo

Increasing GTM and deal velocity for real estate brokers based on available properties data and improving customer relationships.

LotisBlue logo

Providing insights into collaboration and manager effectiveness, and utilizing organizational analyses to enhance diversity, equity, and inclusion outcomes.

Extend Microsoft 365 data into Azure


Copy data at scale into Azure Data Factory

Ideal for big data and machine learning, Microsoft Graph Data Connect allows you to develop apps for analytics, intelligence, and business process optimization by extending Microsoft 365 data into Azure.

Analysis: A twist in the legal fight between X and eSafety, and Google AI advises eating both glue and 'one small rock a day'


Hello and welcome to Screenshot, your weekly tech update from national technology reporter Ange Lavoipierre, featuring the best, worst and strangest in tech and online news. Read to the end for some truly terrible pizza advice and a singing toilet.

The eSafety vs X case just got messier

The eSafety Commissioner's fight against X over videos of the Wakeley stabbing just got messier, with two new groups granted leave to join the case.

The Electronic Frontier Foundation (EFF) and the Foundation for Individual Rights and Expression (FIRE) have pulled off a rare legal manoeuvre, winning the right to participate in Federal Court proceedings they caught wind of over in the United States.

It's another knock to eSafety, which has been trying to force Elon Musk's company to remove or hide about 65 instances of footage showing a stabbing attack on Bishop Mar Mari Emmanuel since April.

To recap briefly, X initially agreed to geoblock the posts, but refused the regulator's subsequent legal notice, which would have meant global removal.

At that point, all hell promptly broke loose.

Amid an intercontinental slanging match between Anthony Albanese and Elon Musk, the Federal Court granted a temporary injunction, which X ignored, ordering the social media platform to hide the material.

The stalemate lasted more than two weeks until Justice Kennett rejected eSafety's bid to renew the injunction, after hearing arguments that the commissioner had overreached.

It was enough of a commotion to attract two American interlopers, the EFF and FIRE, who jointly applied to "intervene" in the matter on behalf of internet users outside Australia.

Their bid has mostly escaped public notice so far, but this week Justice Kennett decided the parties had a right to be heard, despite arguments to the contrary from eSafety.


"It's not automatic and it's quite rare in the Australian context for intervention to be granted," said Kevin Lynch, a partner at Johnson Winter Slattery, the firm representing the two groups.

"Our clients won't be arguing for one side or the other," said Lynch, adding that they're only there to bring an "international perspective".

That perspective happens to overlap significantly with the case being made by X, centring on free speech and the appropriate limits of the commissioner's powers.

"If an Australian court makes a global takedown order, it might signal to other countries that they can impose similar orders under their own laws," Lynch said.

In other words: if it can happen in Australia, there's nothing to stop it from happening in China, Russia, Myanmar or Iran.

EFF and FIRE now have a seat at the table, in recognition of the fact that this fight "has a major impact upon their interests as freedom of speech advocates", Lynch said.

I've given up guessing what will happen next here, but you can tune in again on July 24 for the next slightly more crowded bout.

News Corp zigs where others have zagged, cutting a deal with OpenAI

News Corp has decided to allow  OpenAI, the maker of ChatGPT, to train its models on its journalism.

The company, which owns The Australian, Daily Telegraph and Sky News, also struck a similar deal  with Google in April.

In contrast, the ABC, The New York Times and CNN have all opted to block  OpenAI's web crawler.


The New York Times is even suing the company, along with ChatGPT investment partner, Microsoft, alleging a breach of copyright.

The dilemma of how to engage with AI companies has managed to split the newsroom so comprehensively because there really is an argument each way.

In the words of TJ Thomson and James Meese, writing in The Conversation, journalism may be "signing its own death warrant" by leaping into the jaws of the machine.

On the other hand, with deals like this in place, whose journalism is more likely to rise to the top in an AI-augmented search query?

Speaking of which…

Google AI says you should eat 'one small rock a day'

Another world-class AI has soiled itself in public despite assurances from the maker that its new model is definitely toilet-trained.

Google's new "AI Overviews" tool has told people to add glue to pizza and said geologists recommend eating "one small rock a day".

It's also been spreading dangerous misinformation and the occasional conspiracy theory.

When it comes to the global unboxing of a generative AI tool, we expect nothing less at this point.

Google is remaining upbeat for now, telling the BBC: "The examples we've seen are generally very uncommon queries, and aren't representative of most people's experiences.

"The vast majority of AI Overviews provide high-quality information, with links to dig deeper on the web."

Granted, the rocks thing is funny. After all, it was written by satirists.

But the serious side to this is that Google, for better or worse, is still most people's gateway to the internet, and we should all be worried if it's malfunctioning.

And if it's all too much…

Then go straight to the sauce on r/pizza, the Subreddit from which Google AI looks to have scraped its now-infamous cooking tip.

Alternatively, if you're feeling brave, behold the terrible visage of Skibidi Biden.

All recipe tips, story tips and unhinged Subreddits can be sent securely via Proton Mail.


Related Stories

First porn, then social media: how age verification tech could cross over.


Could X be banned from your phone? The worst-case scenario for Elon Musk


Fear killer robots? This expert believes you should be more worried about what AI is doing to your mind


  • Courts and Trials
  • Information Technology Industry
  • Internet Culture
  • Science and Technology
  • Social Media

IMAGES

  1. 7 Ways to Use Speech Analytics for your business

  2. The Use of Speech Analytics

  3. Find out how speech analytics has evolved to become an essential

  4. Speech analytics

  5. Features Of Big Data Analytics

  6. Big Data Analytics Powerpoint Presentation Slide

VIDEO

  1. Speech Analytics: Better Managing Increases in Volume

  2. Data + AI Summit Keynote, Thursday Part 2

  3. Big Data Analytics for Enhanced Customer Experience in Telecom

  4. Informatica Chalk Talk: Big Data Analytics

  5. BIG DATA ANALYTICS_overview

  6. How speech recognition works in 5 minutes

COMMENTS

  1. Speech: The promise of cross-industry Big Data analytics

    In Big Data Analytics, 1 +1 = 11 I became interested in the prospect of discovering insights from combining unrelated datasets when I came across a study in international development. Two researchers Daniel Björkegren, an economist at Brown University and Darrell Grissen of the Entrepreneurial Finance Lab (EFL) ran a study in 2015 and ...

  2. 10 Best TED Talks on Big Data and Analytics

    1) Shyam Sankar: The Rise of Human-Computer Co-operation. The director of 'Palantir Technologies', Shyam is a data mining innovator who explains that the problem was never man v/s machine but man, machine and the right type of cooperation without dependence on predetermined programs. Intelligence Augmentation, he says, is the way to solve big ...

  3. What Is Big Data Analytics? Definition, Benefits, and More

    Big data analytics is the process of collecting, examining, and analyzing large amounts of data to discover market trends, insights, and patterns that can help companies make better business decisions. This information is available quickly and efficiently so that companies can be agile in crafting plans to maintain their competitive advantage.

  4. 15 Must-Watch Data Analytics TED Talks

    A good forum for this is TED, a nonprofit organization that enlists experts around the world to share stories and insights. TED Talks are thought-provoking and short (usually under 15 minutes each). To learn more about data and data analysis, check out these 15 must-watch TED Talks. 1. A Data Analyst's Journey.

  5. A weathervane for a changing world: refreshing our data and analytics

    Given the theme of this conference, I thought I would say a bit about some fairly unique big data sets the Bank has access to from the data we collect and the data we generate from our own operations. Around 40 of these are managed through an on-premises data and analytics platform, which offers additional computing power to crunch the numbers.

  6. Present Your Data Like a Pro

    While a good presentation has data, data alone doesn't guarantee a good presentation. It's all about how that data is presented. The quickest way to confuse your audience is by ...

  7. 10 TED Talks that will inspire every Data Professional

    10 TED Talks that will inspire every Data Professional. This article was published as a part of the Data Science Blogathon. There is a popular phrase in pop culture, "Nor any drop to drink," from the English poem The Rime of the Ancient Mariner by Samuel Taylor Coleridge. There is also a modern version of this phrase quoted by John Allen Paulos,

  8. 10 Amazing TED Talks on Big Data Analytics

    In the TED Talk, Jean-Baptiste talks about the importance of data analytics and how it can be leveraged to develop a better understanding of the world and its surroundings. He happens to be the creator of an online tool that helps you find and comprehend diverse cultural trends. 4. Big Data, Better Data, by Kenneth Cukier.

  9. PDF Straight talk about big data

    Almost by definition, big data analytics means going deep into the information weeds and crunching the numbers with a technical sophistication that can appear so esoteric that senior leadership may be tempted simply to "leave it to the experts" and disengage from the conversation. But the conversation

  10. The Ultimate Guide to Speech Analytics: 5 Power Steps to Understanding

    A comprehensive guide to harnessing speech analytics for insightful voice data analysis. Introduction. In today's digital age, voice has become one of the most pivotal mediums for communication. From customer support to business meetings, we're constantly speaking, and behind those words lies a goldmine of data. Speech analytics is the tool ...

  11. Speech analytics: The value of the human voice

    Speech data offer customer insights that simply aren't available from other sources, helping to identify the causes of customer dissatisfaction and revealing opportunities to improve compliance, operational efficiency, and agent performance. The results include cost savings of between 20 and 30 percent, customer-satisfaction-score ...

  12. 10 Revealing TED Talks on Big Data and Analytics

    Meanwhile, independently run TEDx events help share ideas in communities around the world. Let's take a look at the top 10 TED Talks on the fascinating topic of Big Data Analytics! 1. 'The birth of a word': Deb Roy (19:52). MIT researcher and CEO and co-founder of Bluefin Labs, Deb Roy wanted to understand how human language is formed.

  13. The Role of Big Data, Machine Learning, and AI in Assessing Risks: a

    The Role of Big Data. It is important to note that all of these remarkable advancements in machine learning are made possible by, and otherwise depend on, the emergence of big data. The ability of a computer algorithm to generate useful solutions from the data relies on the existence of a lot of data.

  14. A Big Data Approach to Public Speaking

    For authenticity, Zandan's team has found that the top 10% of authentic speakers were considered to be 1.3 times more trustworthy and 1.3 times more persuasive than the average communicator. Authenticity is made up of the passion and warmth that people have when presenting. Passion comes from exuding energy and enthusiasm.

  15. How speech analytics is changing customer care

    Customers have so many ways to engage with companies, it may be surprising that voice conversations remain a popular choice. Recent technologies such as voice data analytics are allowing companies to use these personal interactions as a new driver of insight. In this episode of McKinsey Talks Operations —the first in a series of two—host ...

  16. Data Analytics Webinars and Training

    Learn about Data Analytics with BrightTALK. Watch the latest collection of webinars, videos and trainings from industry experts. ... With Big Data and business analytics revenue projected to total $274.3 billion in 2022, data modernization and broader Digital Tran ...

  17. Speech Analysis in the Big Data Era

    In spoken-language analysis tasks, one is often faced with comparably small available corpora of only one to a few hours of speech material, mostly annotated with a single phenomenon, such as a particular speaker state, at a time. In stark contrast to this, engines for the recognition of speakers' emotions, sentiment, personality, or ...

  18. Big data analytics and augmentative and alternative ...

    Language is a medium of communication and a sociological phenomenon that carries a significant aspect of culture, reflecting a country's customs and community. Thus, teachers should not only teach students' language knowledge, including vocabulary and grammar, but also incorporate the cultural context to introduce communication concepts, combining with different ...

  19. Big data and democratic speech: Predicting deliberative quality using

    problem in the analysis of political deliberation. Speech acts can be broken up in a multitude of ways, and it is not always immediately obvious which technique to use for different use cases. In this article, we explore several techniques for vectorizing speech acts into features and training machine learning algorithms on these features (see the short vectorization sketch after this list).

  20. PDF CURRICULUM AND SYLLABI (2022-2023)

    M.Tech (CSE) - (Big Data Analytics) PROGRAMME SPECIFIC OUTCOMES (PSOs): 1. Ability to design and develop computer programs/computer-based systems at an advanced level in areas including algorithm design and analysis, networking, operating ... Text and Speech Analytics Lab; Analytics for Internet of Things ...

  21. How Data Science and AI Help Make Speech Recognition Much More ...

    Speech recognition using data science and AI converts speech signals into text or another machine-readable format. It is a technology that enables computers to understand human speech. It's used ... (see the speech-to-text sketch after this list).

  22. English language teaching based on big data analytics in ...

    Here, BDIAI-AAC is speech recognition trained with a network of animated videos. The Artificial Intelligence (AI)-trained network works on three layers. The input layer is the speech recognition model, which converts the educator's speech into a string. The hidden layer processes the string data and matches it with the corresponding video ...

  23. Big Data: Latest Articles, News & Trends

    Big Data: Tableau Review: Features, Pricing, Pros and Cons. Tableau has three pricing tiers that cater to all kinds of data teams, with capabilities like accelerators and real-time analytics.

  24. Google Advanced Data Analytics Professional Certificate

    In the U.S. and Canada, Coursera charges $49 per month after the initial 7-day free trial period. The Google Advanced Data Analytics Certificate can be completed in less than 6 months at under 10 hours per week of part-time study, so most learners can complete the certificate for less than $300 USD.

  25. What Is Machine Learning? Definition, Types, and Examples

    Machine learning definition. Machine learning is a subfield of artificial intelligence (AI) that uses algorithms trained on data sets to create self-learning models that are capable of predicting outcomes and classifying information without human intervention. Machine learning is used today for a wide range of commercial purposes, including ...

  26. ChatGPT's voice closely resembles Scarlett Johansson's, says lab ...

    A new lab analysis conducted for NPR by Arizona State University data scientists shows that OpenAI's "Sky" voice is more similar to Johansson's than hundreds of other actors analyzed.

  27. Microsoft Graph Data Connect

    Copy data at scale into Azure Data Factory. Ideal for big data and machine learning, Microsoft Graph Data Connect allows you to develop apps for analytics, intelligence, and business process optimization by extending Microsoft 365 data into Azure. Learn more. Extract large amounts of data and safeguard it with built-in security.

  28. US Jobs Report May 2024: Live News on Employment, Payrolls

    Here are the median estimates in Bloomberg economist surveys for some of the key data points: a 180,000 gain in nonfarm payrolls, not much changed from 175,000 in April; the unemployment rate ...

  29. A twist in the legal fight between X and eSafety, and Google AI advises

    The eSafety Commissioner's fight against X over videos of the Wakeley stabbing just got messier, with two new groups from the other side of the world granted leave to join the case.
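
Item 19 above describes vectorizing speech acts into features and then training machine learning algorithms on those features. Below is a minimal sketch of that general idea, assuming scikit-learn is installed; the four example sentences and their "high"/"low" quality labels are invented purely for illustration and are not drawn from the cited study.

```python
# Minimal sketch: turn short speech acts into TF-IDF features and
# train a classifier on them. The corpus and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical speech acts with made-up "deliberative quality" labels.
speech_acts = [
    "I respect your point, but the evidence suggests otherwise.",
    "You clearly have no idea what you are talking about.",
    "Could you explain how the proposal affects rural communities?",
    "This debate is a waste of everyone's time.",
]
labels = ["high", "low", "high", "low"]

# Vectorize the text and fit a simple classifier in one pipeline.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(speech_acts, labels)

# Predict the label of a new, unseen speech act.
print(model.predict(["Let us hear the other side before we vote."]))
```

A real study would use a much larger labelled corpus and richer features, but the overall pipeline shape, a vectorizer feeding a classifier, stays the same.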
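
Item 21 above describes converting speech signals into text. The following sketch shows one common way to do that in Python, assuming the third-party SpeechRecognition package (pip install SpeechRecognition) is available; the file name sample.wav and the use of Google's free web recognizer are illustrative assumptions, not details from the original article.

```python
# Minimal sketch: transcribe a local WAV file to text.
# Assumes the SpeechRecognition package and a file named "sample.wav".
import speech_recognition as sr

recognizer = sr.Recognizer()

# Load the audio file and capture its contents.
with sr.AudioFile("sample.wav") as source:
    audio = recognizer.record(source)

# Send the audio to Google's free web recognizer and print the transcript.
try:
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Speech was unintelligible.")
except sr.RequestError as err:
    print(f"Recognition service unavailable: {err}")
```

Offline engines or commercial APIs can be swapped in for the recognizer; the overall flow of loading audio, passing it to a recognizer, and handling failures remains the same.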