Article Contents

1. Introduction
2. The Law of Everything Applied to Everyone
3. Meaningless Law on the Books: The Tension between Complete and Effective Protection
4. Introducing ‘Site-Level’ Flexibility
5. Conclusion

Complete and Effective Data Protection

Orla Lynskey, Complete and Effective Data Protection, Current Legal Problems, Volume 76, Issue 1, 2023, Pages 297–344, https://doi.org/10.1093/clp/cuad009

Data protection law is often invoked as the first line of defence against data-related interferences with fundamental rights. As societal activity has increasingly taken on a digital component, the scope of application of the law has expanded. Data protection has been labelled ‘the law of everything’. While this expansion of material scope to absorb the impact of socio-technical changes on human rights appears justified, less critical attention has been paid to the questions of to whom the law should apply and in what circumstances. The Court of Justice has justified an expansive interpretation of the personal scope of the law in order to ensure ‘effective and complete’ data protection for individuals. This article argues that the attempt to make the protection offered by the law more ‘complete’ risks jeopardising its practical effectiveness and raises doubts about the soundness of the regulatory approach to data protection. In the quest for effective and complete protection, it seems that something must give.

The right to data protection enjoys a privileged position in the EU legal order. 1 The right is strictly interpreted by the Court of Justice of the EU (CJEU) and is given remarkable weight when balanced with other rights and interests. 2 While data protection sits alongside the more established right to respect for private life in the EU Charter, 3 it is data protection rather than its more established counterpart that is specifically referenced in the EU’s General Data Protection Regulation (GDPR). The GDPR, like its predecessor, a 1995 Directive, has influenced the adoption of European-style data protection laws globally. 4 Recently adopted EU legislative initiatives in the digital sphere, such as the Digital Markets Act 5 and the Digital Services Act, 6 are all ‘without prejudice to’ the GDPR. 7 Data protection is, therefore, both a cornerstone of EU digital regulation and its international poster child, and is treated as an ‘issue of general and structural importance for modern society’. 8 Yet, set against this success story of EU data protection law, recurring reservations have been expressed about both its boundaries and its capacity to achieve its objectives in practice. 9

A key concern is that EU data protection has become the law of everything applied to everyone, putting compliance with the legal framework, and those charged with its enforcement, under strain. This development of the law is driven, to a large extent, by the jurisprudence of the CJEU. Scholars attribute the broad scope of the law to the need to protect fundamental rights in the context of significant socio-technical changes. 10 Since the 1970s, when data protection laws were first adopted, these laws have sought to address the risks and harms for fundamental rights that stem from personal data processing. 11 At that time, the primary focus was on mitigating the adverse effects that might follow for individuals from the holding and controlling of files on them and the combining of information across databases and computer systems. 12 Although these concerns are still present, the technological and societal landscape has shifted dramatically. Advances in automation, such as the widespread availability of generative AI, will further unsettle the environment to which the law applies and which shapes its application.

To date, the law has expanded to absorb the impact these socio-technical changes might have on fundamental rights with the Court emphasising the need for ‘effective and complete’ data protection in its jurisprudence. This article argues that the broad personal scope of application of the law—the attempt to make the protection offered by the law more ‘complete’, in the language of the Court—risks jeopardising its practical effectiveness and raises doubts about the soundness of the regulatory approach to data protection. 13 In the quest for effective and complete protection, it seems that something must give. While a broad application of the concept of personal data is necessary to protect fundamental rights in light of socio-technical developments, the legislature may need to revisit to whom the law applies and what obligations adhere to distinct controllers under the legal framework. This inquiry also illuminates the need for further reflection and research on the relationship between the law’s scope, compliance with the law by its addressees and its enforcement by regulators.

This argument proceeds in three parts. First, it outlines why it is now argued that data protection has become the law of everything but suggests that the more significant development is the application of the law to everyone, with few exceptions to its material and personal scope of application. While existing legal literature has queried whether the law should apply to everything, much less attention has been dedicated to the question of whether everyone should be subject to the same legal obligations. Second, it demonstrates that this ideal of complete protection is leading to cracks in the legal framework and suggests that these cracks are currently being patched over by Courts and regulators in a way that is itself antithetical to effective data protection. Third, it interrogates whether some of these problems might be addressed by adopting a more flexible approach to data protection interpretation and enforcement. This approach itself raises fundamental questions that must be addressed, suggesting the time may be ripe for a more radical rethink of the data protection framework. 14

Data protection is a regulatory regime that puts in place a series of both rules and principles that must be applied whenever personal data is processed. It regulates the creation, collection, storage, use and onward transmission of personal data, amongst others. 15 At its most basic, when the data protection framework applies, personal data processing can be legitimised if certain conditions are met: there must be a legal basis for processing and adherence to the principles of fair data processing. 16 The legal framework thus imposes compliance obligations primarily on ‘data controllers’ and grants rights to individuals (‘data subjects’). 17 An innovation in the GDPR is the introduction of a suite of meta-regulatory obligations, including an obligation of demonstrable accountability applicable to controllers and various other compliance requirements such as the need to conduct data protection impact assessments and to appoint a data protection officer (DPO) in some circumstances. 18 In the EU, this legislative framework is undergirded by the right to data protection found in the EU Charter of Fundamental Rights. 19 The Court has held in its caselaw that the very act of personal data processing engages the right to data protection and must therefore comply with the requirements set out in Article 8 EU Charter. 20 The legislative framework could therefore be viewed as something that simultaneously facilitates the interference with a fundamental right while allowing for the justification of this interference if its legal requirements are satisfied. 21 From a human rights law perspective, the entire legislative framework functions as a justificatory regime. The implicit aim of the legal framework is to ensure that data processing operations are proportionate in that they pursue a legitimate aim and contain safeguards to ensure they do not go beyond what is necessary to achieve that aim.

Since the adoption of data protection laws by the EU in 1995, the data protection framework has been characterised by its expansive scope of application. The key concepts determining the material scope of application of the EU system are defined broadly, with exceptions construed narrowly. It follows that as societal activity now increasingly has a digital component, data protection has become an almost unavoidable legal framework 22 : data protection is the law of everything, 23 applied to everyone. This is, however, as much a result of a legal evolution as it is a socio-technical one. This section will trace how this has come to pass. The material and personal scope of the rules are defined and interpreted expansively while exceptions to their scope have been construed restrictively. Moreover, attempts to limit this expansionist approach have been rejected by the CJEU. Later sections will explore the implications of this expansionist approach for effective data protection.

A. The Law of Everything

Data protection law applies to the processing of personal data. Any operation or set of operations performed upon personal data, whether by automatic means or not, constitutes processing. It is therefore difficult to conceive of any type of activity with a digital component that would not constitute processing. 24 The only limitation found in the law is that where the processing is conducted manually, as opposed to fully or partly automated processing, the data processing must form part of a filing system which allows for the easy retrieval of an individual’s data file. 25 For the law to apply, however, it is personal data that must be processed.

Data protection law operates in a binary way: it applies when the data processed are classified as ‘personal’ data but does not apply to the processing of non-personal data. 26 Much therefore hinges on what is classified as ‘personal data’. Anonymous data is not treated as personal data whereas data that is pseudonymised, where the data can only be attributed to a specific individual once combined with additional information which is separately held and subject to additional measures to ensure non-attribution, is personal data. 27 The scope of the term personal data is wide, as we shall see, and what constitutes personal data is varied. 28 Personal data is defined as ‘any information relating to an identified or identifiable natural person’. 29 While much of the focus in the existing doctrine is on the issue of identifiability 30 —what does it mean for an individual to be identified and when is an individual identifiable—the other elements of the definition may be equally consequential for its application. Indeed, while it is necessary to disaggregate these elements in order to apply this definition, it is only by considering them together that the overall reach and impact of the law can be determined. Some examples may help to illustrate these points.
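
The pseudonymisation point can be made concrete with a short sketch. The snippet below is purely illustrative and assumes a Node.js/TypeScript environment; the key handling, field names and data are invented, and nothing in the GDPR prescribes this particular technique. A keyed hash stands in for the direct identifier, but because a separately held secret still permits re-attribution, the record is pseudonymised rather than anonymous and so remains personal data.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical illustration: the secret is held separately, under its own
// technical and organisational measures, by (say) a journal editor.
const separatelyHeldSecret = "kept-under-separate-access-controls";

// Replace a direct identifier with a stable token derived from it.
function pseudonymise(directIdentifier: string): string {
  return createHmac("sha256", separatelyHeldSecret).update(directIdentifier).digest("hex");
}

const review = {
  reviewer: pseudonymise("reviewer@example.org"), // token, not the e-mail address
  comments: "Section 2 overstates the case law.",
};

// Whoever holds the secret can recompute the token for any known reviewer and
// re-link the record, so the identifiability threshold is still met.
console.log(review);
```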

Many publishers describe the peer review process as anonymous on the basis that the data being processed—in this case the article distributed for peer review and the comments of the reviewers—do not reveal the identity of the individuals at stake. 31 Anonymity in this colloquial sense is distinct from anonymity as defined in the GDPR. In the peer review context, individuals are deemed anonymous if they are neither identified nor identifiable from the data immediately available to authors or reviewers (an errant reference to previous work revealing an author’s identity, for instance). 32 However, for GDPR purposes, irrespective of whether the article or review allowed for an individual’s immediate identification, they would meet the legal standard for identifiability. An individual is considered identifiable where they can be identified, directly or indirectly, using means reasonably likely to be used by the data controller or by any third party. In this example, the identifiability threshold is easily met as the journal editor is able to identify both the author of the article and the reviewer even where they remain unknown to one another. We might be tempted to stop the analysis here; however, the remaining elements of the definition must also be met. If an unreliable author submitted a piece of work that had been generated by ChatGPT and contained inaccuracies attributed to non-existent sources, this would nevertheless constitute ‘information’. 33 Instinctively, we might also think that an academic article could not be personal data as its content is not about a particular academic; it is simply the output of their efforts. Early caselaw in the UK, for instance, insisted that personal data must focus on an individual or be biographical in a significant sense. 34 However, the Court endorsed a much more capacious vision of personal data in Nowak, finding that information can relate to an individual in so far as it is linked to the individual by reason of its ‘content, purpose or effect’. 35 In Nowak, the CJEU considered that the examination script of a candidate in an open book examination ‘related to’ the candidate as the content of the answers reflected the extent of the candidate’s knowledge and competence; the purpose of the processing was to evaluate their professional abilities and suitability for practice; and the use of that information would be liable to have an effect on their rights and interests. 36 The Court also held that the examiner’s comments related to the candidate as, amongst others, their purpose is to record the examiner’s evaluation of the candidate’s performance and they are liable to affect the candidate. 37 This reasoning would apply by analogy to an article submitted for peer review and the comments of the reviewer. Despite the fact that publishers tend to refer to this process as anonymous, suggesting it would fall outside the law’s scope, we would therefore conclude that the peer review process constitutes personal data processing to which the data protection framework applies.

A further example is the act of uploading some content to social media, for instance, a photograph with friends or a video of colleagues. This would again easily meet the threshold criteria for the law to apply. Personal data can be any information: it is not restricted to information that is private or sensitive. 38 This information is linked to the individuals captured in terms of its content: it is about them, and the processing of this information might impact upon them, for instance, if they were photographed with friends during the working day. Even if they could not be immediately identified on the basis of the photograph, they are identifiable at least to the person who uploaded the content online. Notably, they are also potentially identifiable to third parties such as phone companies if, using means reasonably likely to be used, they could combine this data with other data they hold, such as geo-location data, to identify the individuals concerned. 39 Here one might object that the social media user has a right to impart information as part of their right to freedom of expression, thus excluding the data protection rules. However, rather than excluding protected free speech from the scope of data protection law, it is brought within the scope of the data protection framework and tensions between data protection and freedom of expression are reconciled from within the data protection framework. 40 This is similar to the example provided by Advocate General Bobek in his Opinion in X and Z: an individual in a pub who shares an e-mail containing an unflattering remark about a neighbour with a few friends becomes a data controller subject to the GDPR’s obligations. At the hearing in that case, the Advocate General noted that the Commission accepted that even the incidental processing of personal data triggers the GDPR’s rights and obligations and that it had difficulty explaining where the limits of the law lie. 41
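
The combination the Court has in mind can be sketched briefly. The example below is hypothetical: all field names, data structures and the matching threshold are invented for illustration, and no particular operator's systems are being described. It simply shows how a third party holding its own location records could shortlist the individuals appearing in an otherwise unlabelled upload by matching time and place.

```typescript
// Hypothetical illustration of identification 'using means reasonably likely to
// be used' by a third party; every name, value and threshold here is invented.
interface UploadedPhoto {
  takenAt: string; // timestamp taken from the upload's metadata
  lat: number;
  lon: number;
}

interface NetworkLocationRecord {
  subscriber: string; // a named account holder
  seenAt: string;
  lat: number;
  lon: number;
}

// A phone company could join the photo's time and place against location
// records it already holds to produce a shortlist of named individuals.
function candidateSubjects(photo: UploadedPhoto, records: NetworkLocationRecord[]): string[] {
  return records
    .filter((r) => r.seenAt === photo.takenAt)
    .filter((r) => Math.abs(r.lat - photo.lat) < 0.001 && Math.abs(r.lon - photo.lon) < 0.001)
    .map((r) => r.subscriber);
}
```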

At its more extreme, the literature provides examples of data which can plausibly be argued to meet the definition of personal data although intuitively ‘far from being “personal”’. 42 Purtova takes the example of a smart city project in the Dutch city of Eindhoven initiated by a public–private collective to anticipate, prevent or de-escalate anti-social behaviour on Stratumseind, a street known for its social life. The data used for this behavioural regulation is gathered from multiple sources and includes weather data, such as rainfall per hour and wind direction and speed. Purtova reasons that weather contains information which is then datafied; that this relates to individuals as it can be used to assess and influence behaviour deemed undesirable and that this information, when combined with other data collected via sensors, can lead to the identification of individuals. Indeed, this is the very purpose of the Stratumseind 2.0 project. She proposes that weather data could therefore be classified as personal data. Others have applied similar analysis to other environmental artefacts, such as wastewater. 43 Once we start to look around us to apply this definition we see that almost all data is potentially personal data if applied to evaluate or influence individuals thus making data protection the law of everything (or almost everything).

This development is desirable if we consider that it is no longer simply data about an individual that might be leveraged to impact upon their rights. 44 Take, for instance, synthetic or artificial data derived from personal or non-personal data to create replica datasets. Such synthetic data may be used to make significant and impactful decisions about identified individuals. In such circumstances, it could be classified as personal data under the GDPR. 45 While this might seem to confirm Purtova’s concerns that data protection law is the law of everything, Dalla Corte highlights that information that relates to someone as a result of its impact on them will not necessarily be personal throughout its entire lifecycle. 46 For instance, data about the performance of a vehicle is non-personal data until the point when it relates to someone, such as when it is used to evaluate a driver’s performance. 47

A further feature of the legal framework is that while ‘personal data processing’ is potentially all-encompassing, the limited derogations to the material scope of the GDPR are construed restrictively. Data processing for EU external action, national security purposes and processing by competent authorities for law enforcement purposes fall outside of the GDPR’s ambit, 48 as does data processing undertaken by the EU institutions. 49 The only other derogation is for data processing for ‘purely personal or household purposes’. 50 The uploading of content to social media might seem to constitute such a purpose; however, this is not necessarily so, as the Lindqvist case demonstrates. Mrs Lindqvist was a church catechist in Sweden who, as coursework for an evening class on computer processing, uploaded short descriptions of her colleagues to the church website. She was criminally prosecuted for illegal data processing and, amongst the many defences invoked in the ensuing court proceedings, it was argued that Mrs Lindqvist was engaged in ‘purely personal or household’ processing. The CJEU acknowledged that Mrs Lindqvist’s activities were charitable and religious rather than commercial 51 but refused to apply this derogation. It considered that the processing concerned was ‘clearly not’ carried out in the course of an activity relating to the private or family life of individuals, as the internet publication resulted in the data being made accessible to ‘an indefinite number of people’. 52 In later jurisprudence, the Court found that when a home security camera used for personal security captures not only the home but the public footpath outside, it too cannot benefit from this derogation. 53 In this way, many of the routine data processing operations of individuals are brought within the law’s fold.

As this section suggests, the concept of personal data has the capacity to bring all impacts of data usage on the fundamental rights of individuals within the remit of data protection law. Given that the law is concerned with the protection of rights rather than the protection of data per se, this expansion is desirable and legitimate. For instance, at the point at which weather data is used to assess an individual’s potential criminality, it is appropriate that legal protections are activated. However, as Mrs Lindqvist’s case suggests, this does raise questions about to whom the law applies and the extent of their obligations under the framework. It is these questions of scope that require further consideration and to which we shall now turn.

B. Applied to Everyone? The Data Controller and Joint Controllership

To whom does this vast legal framework apply? Data protection law distinguishes between data subjects, the individuals whose personal data are processed, and data controllers and processors, who initiate and undertake the data processing. Data controllers act as the ‘intellectual lead’ 54 or brains behind the data processing operation—determining the purposes and means of processing 55 —while data processors act as the brawn—conducting the data processing under the instruction of the data controller. 56 Primary legal responsibility is attributed to the data controller, although the GDPR does confer specific responsibilities on the data processor for some tasks. 57

While these concepts and the division of labour between them appear clear, already in 2010 it was noted that their concrete application was becoming increasingly complex leading to uncertainty regarding responsibilities under the framework. 58 The main reason for this complexity is that modern data processing is itself complex 59 : unlike the conditions that prevailed when data protection laws were first adopted, control over processing is no longer centralised 60 or exercised by singular actors who use available technologies for easily distinguishable purposes. 61 Moreover, technologies confound the distinction between means and ends that the GDPR deploys: determining the appropriate technical tools for the job (de facto a task often assumed by the processor) can have a significant bearing on the purposes to which those tools can be put and, ultimately, the functioning of a socio-technical system.

This messiness of the socio-technical environment is recognised, to some extent, through the concept of joint controllership: controllers can determine the purposes and means of processing alone or ‘jointly with others’. This joint control can take different forms: it can result from a common decision on purposes and means or from ‘converging decisions’, where complementary decisions have a tangible impact on the purposes and means of processing and the processing would not be possible without the participation of the jointly controlling entities. 62 For our purposes, what is significant is that the concept of controllership is both defined and interpreted expansively. Per the definition, a controller or a joint controller can be a ‘natural or legal person, public authority, agency or any other body’. 63 Like other forms of regulation, such as environmental regulation and consumer protection laws, data protection is a form of mixed-economy oversight: the law, therefore, applies equally to public actors, such as local authorities or departments of government, as it does to private enterprise. For the latter, there is little differentiation made between large multinational companies and the local corner shop. 64 Moreover, the law brings individuals within its reach as data controllers, subject to the limited derogation for purely personal and household processing noted above.

The CJEU has had opportunity to interpret the notion of controllership on numerous occasions, taking these opportunities to stretch the concept to ensure the ‘complete and effective’ protection of individuals. We could locate the foundations for this broad approach in the Court’s Google Spain judgement. While this ruling is best known for its recognition of a ‘right to be forgotten’ in EU data protection law, its finding that Google search engine is a data controller was also momentous. 65 Notably, in an earlier advisory opinion on the application of data protection law to search engines, the advisory body comprised of data protection regulators (the Article 29 Working Party) had considered that where a search engine acts purely as an intermediary, the principle of proportionality requires that it should not be considered the principal controller of the content. 66 However, the Court implicitly rejected analogies with other areas of law where intermediaries such as Google Search enjoy quasi-immunity from liability for hosting illegal content until they have actual or constructive awareness of such content. Google had argued that when providing hyperlinks to content already available online it did not differentiate between links to primary publications containing personal data and those that did not. 67 The Court applied the controllership test broadly, finding that in the context of this linking activity it is the search engine operator that determines the purposes and means of the personal data processing. 68 It considered that it would be contrary to the clear wording of the definition of data controller and its objective to exclude search engine operators, going on to note that the role of search engine operators is distinct from primary publishers and that the former is liable to affect fundamental rights to privacy and data protection ‘significantly and additionally’ compared with the latter. 69 Importantly, the Court considered that the objective of the broad definition of data controller is to ensure ‘effective and complete protection of data subjects’. 70

Later jurisprudence brought this concern for the ‘effective and complete’ protection of individuals to the fore, sometimes at the expense of the law’s literal meaning. 71 In Wirtschaftsakademie ( Facebook fan pages) the Court held that the administrator of a fan page on Facebook was a joint controller. 72 Visitors to the fan page, both Facebook users and non-users alike, had data collecting cookies placed on their devices by Facebook and the Court reasoned that the fan page operator provided Facebook with this opportunity. 73 Moreover, the fan page operator also defined the parameters for the statistical analysis of visitor’s data conducted by Facebook, thereby contributing to and determining the purposes and means of processing. 74 The later Fashion ID case, where the Court considered whether the integration of a Facebook social plug-in (the Facebook like button) into a website was sufficient to make the website operator a joint controller, confirmed that the definition of parameters for data analytics by a Facebook fan page was not what was decisive. 75 In Fashion ID , the mere presence of the piece of Facebook code on the website—triggered when the website was consulted—was sufficient to transmit data from the website user’s device to Facebook. The website visitor did not need to click on the plug-in or be a Facebook user for this to occur. 76 The Court was asked whether embedding a piece of Facebook code on a website was sufficient for the website operator to constitute a data controller, particularly given that once the data was transmitted to Facebook the website operator had no influence on the subsequent data processing. The Court broke the data processing operations down into segments. It determined that Fashion ID exercised joint control over the collection and transmission of the personal data of visitors to its website, a first segment, however, it was not responsible for subsequent processing operations, over which it had no influence. 77 Specifically with reference to the means of processing, the Court emphasised that Fashion ID was ‘fully aware’ of the fact that the embedded plug-in served as a tool for the collection and transmission of personal data to Facebook. 78 The Court concluded that through the embedding of the plug-in on its website, Fashion ID exerted ‘decisive influence’ over the data processing that would not have occurred in the absence of the plugin 79 and that there was joint control over the data processing operation. 80 In support of this conclusion, the Court pointed to the mutual benefit the data processing provided to Fashion ID and Facebook Ireland. 81
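
The technical mechanism at issue in Fashion ID can be sketched in a few lines. The snippet below is a hypothetical, simplified stand-in for a social plug-in embed; the provider URL and element ID are invented and are not Facebook's actual integration code. The point is that merely loading the page makes the visitor's browser request the third party's script, and that request alone discloses the visitor's IP address, browser identifiers and any provider-set cookies, with no click and no account required.

```typescript
// Hypothetical sketch of embedding a third-party social plug-in on a website;
// the provider URL and container ID are invented placeholders.
function embedSocialPlugin(containerId: string): void {
  const script = document.createElement("script");
  // Fetching this script sends an HTTP request to the provider's servers. That
  // request itself carries the visitor's IP address, User-Agent and any cookies
  // the provider previously set -- before any click, and whether or not the
  // visitor holds an account with the provider.
  script.src = "https://plugins.example-provider.com/like-button.js";
  script.async = true;
  document.getElementById(containerId)?.appendChild(script);
}

embedSocialPlugin("social-widget");
```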

In both of these instances, the fan pages and website operators did not ‘hold’ or have access to the data undergoing processing, thus rendering them incapable of complying with the vast majority of the regulatory framework (a point to which we shall return). The Court addresses this point, finding that the classification of data controller does not necessitate that the data controller has access itself to the personal data collected and transmitted. 82 Implicitly, the role of facilitating and benefiting from data processing is sufficient to incur legal responsibility. 83 Jehovan todistajat offers more explicit confirmation of this understanding of controllership in the context of the relationship between the Jehovah’s witness community, its congregations and its preaching members. 84 In the conduct of their preaching activities, preaching members of the Jehovah’s witness community (the community) took notes regarding the people they met. These notes served the dual purpose of acting as an aid for future visits and to compile a ‘refusal register’ of those who did not want to be contacted again. The community and its congregations coordinated this preaching activity by creating maps allowing for the allocation of areas between preaching members and keeping records about preachers and the number of leaflets they distributed. 85 While the preaching members received written guidelines on note-taking published in a magazine for members, they exercised their discretion as to the circumstances in which they should collect data; which data to collect; and how those data are subsequently processed. 86 Yet, the role of the community in ‘organising, coordinating and encouraging’ this preaching activity was sufficient for it to be deemed a joint controller. 87

In Jehovan , we might distinguish between the overarching aim or purpose of data processing—to encourage new members to join the community—which is determined by the community and more essential elements of the processing (such as which data to be processed and who should have access to the data) which was determined by the preaching members. 88 The orchestrating role of the community is sufficient to establish responsibility under data protection law, without the need for access to the data 89 or to have produced written guidelines around data processing. 90 This is perhaps unsurprising given that the preaching was carried out in furtherance of the overarching objectives of the community—to spread its faith—and the community acted as the ‘intellectual lead’ on the data processing. In a subsequent case, the Court is asked to determine whether a standard-setting organisation that offers its members a standard for managing consent specifying how personal data is stored and disseminated is a data controller. 91 The way in which the standard-setting organisation ‘organises and coordinates’ personal data processing through this standard seems highly likely to meet the criteria set by the Court in Jehovan .

This low legal threshold for controllership, when combined with technical–organisational developments, particularly the increasingly interconnected nature of information systems and markets, will therefore make joint controllership more prevalent. 92 This has the benefit of enabling regulators to more easily bring complex data processing structures within their regulatory remits, as was the case in the standard-setting investigation noted above. However, it also brings more individuals and tangential actors within the law’s fold. We might conclude that, to the extent that it is necessary to establish ‘which level of influence on the “why” and “how” should entail the qualification of an entity as a controller’, 93 the answer is very little. This caselaw leaves one with the impression that everyone is responsible for data processing from the facilitators (such as Fashion ID) to the orchestrators (such as the community). Data protection is, it seems, the law of everything applied to everyone. We will return to the question of whether this is desirable below.

C. Failed Attempts to Limit the Law

This expansive evolution of the scope of data protection law has been challenged. Prior to the development of European case law, British courts tended to interpret its material scope more restrictively. The notion of processing was interpreted narrowly to exclude the act of anonymising personal data on the grounds of ‘common sense and justice alike’ 94 while information only constituted personal data relating to someone when it was private or biographical in a significant sense. 95 At European level, pushback has come from within the Court in the Opinions of its Advocates General.

Advocate General Sharpston sought to keep the material scope of the rules in check by proposing alternative readings of the concepts of automation, processing and personal data in her Opinions. It is recalled that the GDPR applies to personal data that is processed manually as part of a filing system or that is processed ‘wholly or partly by automated means’. In an early case where the right to access documents was pitted against the data protection rights of those featuring in the documents, she sought to avoid a balancing of interests by suggesting that the data protection rules did not apply. The retention and making available of these meeting minutes using a search function was not, she opined, ‘automated’ processing. Her reasoning was that throughout this process the ‘individual human element plays such a preponderant part and retains control’ 96 in contrast to ‘intrinsically automated’ processing operations such as the loading of a website. The search function, like the use of an electric drill, could be replicated by humans but simply with less efficiency. 97 This reasoning was undoubtedly influenced by the Advocate General’s opinion that ‘the essence of what is being stored is the record of each meeting, not the incidental personal data to be found in the names of the attendees’. 98 Had the Advocate General’s reasoning been accepted, the range of processing operations to which the data protection framework would apply would have been dramatically limited. 99 The Court did not follow, or even acknowledge, the Advocate General’s attempt to place boundaries around the notion of personal data processing. 100

When the Court was asked to consider whether the legal analysis found in an administrative note concerning the immigration status of several individuals constituted personal data in the YS, M and S case, Advocate General Sharpston again proposed to restrict the law’s material scope. As in Bavarian Lager she emphasised the human dimension of the processing. Legal analysis is a process controlled entirely by individual human intervention through which personal data (in so far as they are relevant to the legal analysis) are assessed, classified in legal terms and subjected to the application of the law, and by which a decision is taken on a question of law. 101 Once again, the Court did not acknowledge this perspective. It did, however, find her opinion on what constitutes personal data more persuasive. Her opinion suggested that the definition of personal data should be confined to ‘facts’ about an individual, whether objective (e.g. weight in kilos) or subjective (underweight or overweight), 102 to the exclusion of the reasoning or explanation used to reach such conclusions or facts. 103 She was unconvinced that the definition of personal data should ‘be read so widely as to cover all of the communicable content in which factual elements relating to a data subject are embedded’. 104

The Court concurred, finding that legal analysis is not information relating to the applicant but is, at most, information about the assessment and application of the law to the applicant’s situation. 105 Like the Advocate General, it supported this conclusion by reference to the broader legal framework, suggesting that its interpretation was borne out by its objectives and general scheme. 106 It reasoned that in order to promote the law’s objectives of protecting fundamental rights, including privacy, the law gives individuals the right to access data to conduct ‘necessary checks’ (to check its legality; to rectify or delete in some circumstances). In this instance, as the legal analysis itself is not liable to be subject to the checks set out in the right to access, such as an accuracy check, granting access to the data would not serve the law’s purpose. 107 The Court’s reasoning in this case is flawed: it rendered the scope of application of the legal framework contingent on whether substantive rights can be exercised in a particular scenario, although the scope of the legal framework is a logically prior question. 108 What is notable, however, is that YS is a ‘rare instance in which the Court has read the concept of “personal data” restrictively’. 109

However, in the later Nowak case, the Court seems to recognise this misstep as it differentiates explicitly between ‘classification’—the scope of the rules—and ‘consequences’—the substantive responsibilities they impose. It held that whether the answers and exam comments could be classified as personal data should not be affected by the consequences of that classification. 110 To confirm this point, the Court emphasised that if data are not personal data they are entirely excluded from data protection’s principles, safeguards and rights. 111 While the Court made a weak reference to YS and M and S, intimating that it might be distinguished on the facts, its findings and reasoning in Nowak stand in opposition to YS. At best, the current status of YS is ‘somewhat uncertain’. 112 However, given the Court’s later expansive line in Nowak, it is perhaps more reasonable to treat YS as an anomaly.

The scope of the notion of controllership has also been subject to contestation. In Facebook fan pages, the referring court hinted at the possibility of a ‘third way’ to attribute responsibility for data processing beyond controllership and joint controllership. It considered that the operator of a fan page was not a controller but queried whether the action of choosing which operators to engage with should entail some responsibility for the fan page host. 113 The Court simply considered the fan page operator to be a joint controller. In their opinions on data controllership, Advocates General also expressed their unease about the expansive personal scope of the law, albeit without fully articulating their concerns. In Google Spain, the Advocate General proposed a knowledge component to controllership 114 : the data controller should be aware in some ‘semantically relevant way’ of what kind of personal data they are processing and why 115 and then process this data ‘with some intention which relates to their processing as personal data’. 116 Advocate General Bobek was most forthright in expressing his concerns, openly querying whether this strategy of broadly interpreting controllership—making ‘everyone’ responsible—would enhance effective protection. 117 The Court was not ‘faced with the practical implications of such a sweeping definitional approach’. 118 The Advocate General does not, however, develop how the broad scope of the law might hinder its effectiveness or what the practical implications of this broad scope might be. Having shown how judicial developments in the EU mean that data protection law might now be credibly classified as the law of everything applied to everyone, we now turn to examining this question: what are the consequences of this broad scope for the effectiveness of the law?

The scope of data protection law has been interpreted expansively with a view to preventing human rights infringements. Simitis argued that, to achieve their preventive function, these rules should be strictly applied but, primarily, that they should adapt to ‘both the exigencies of an evolving technology and of the varying structural as well as organisational particularities of the different controllers’. 119 No doubt the Court considers that it has remained true to this mission in its jurisprudence. However, this approach is increasingly questioned. Advocate General Bobek suggests that the current approach is ‘gradually transforming the GDPR into one of the most de facto disregarded legislative frameworks under EU law’. 120 Similar reservations are expressed in the academic literature. Bygrave and Tosoni note that the law’s enormous scope of application is ‘perhaps beyond what it can cope with in terms of actual compliance and enforcement’. 121 Nolan observes that the Court’s approach appears to assume that ‘by applying data protection law to more actors better protective outcomes will be achieved’, 122 while Koops more explicitly declares data protection law to be ‘meaningless law on the books’ as a result of, amongst others, its broad scope. 123 Therefore, although the Court justifies its expansive application of the law on human rights grounds, this quest for completeness may be in tension with the law’s effectiveness and the attainment of these human rights objectives. In other words, we must query whether data protection law can be both complete and effective.

A. Assessing the Effectiveness of the Law

When we test this claim—that data protection law can be all-encompassing or effective but not both—we are immediately faced with the challenge of determining appropriate parameters to assess the effectiveness of the law. As one data protection authority has noted, while the volume of work they undertake is ever intensifying, what remains elusive ‘is any agreed standard by which to measure the impacts and success or otherwise of a regulatory intervention in the form of GDPR that applies to literally everything’. 124 While the idea of measuring the impact of human rights and the methodologies used remain contested, scholars such as De Búrca have sought to break the deadlock by proposing an experimentalist account of human rights to assess their effectiveness. 125 However, such accounts speak predominantly to how Treaty and Charter rights, rather than the legislative frameworks that implement them, have been harnessed for social change. Policymakers, journalists and civil society organisations tend to speak of the effectiveness of the GDPR in terms of the complaints resolved by authorities and the remedies and sanctions imposed. 126 The number of complaints lodged by data subjects was also deemed by the European Commission to be an appropriate indicator of the impact of the GDPR to be taken into consideration when monitoring the implementation of the law. 127 However, the number of complaints alone provides an inconclusive indication of success. Not only is data gathering in this area very inconsistent, detracting from its reliability, 128 but, more fundamentally, interpreting this data is difficult. A low number of complaints or insignificant fines could be indicative of either a dysfunctional system of enforcement or widespread compliance with existing obligations. 129 Equally, while by August 2023 an impressive 1.4 million requests for the erasure of links from Google’s search engine had been submitted pursuant to the GDPR, 130 this figure gives us only a small insight into the overall exercise of individual rights and tells us nothing of who is exercising their rights and whether these requests were appropriately handled. 131 In assessing the effectiveness of the law, we might then return to a simple test that asks what the law’s objectives are and whether those objectives have successfully been attained. 132

The stated objectives of the GDPR are two-fold: to remove impediments to the free flow of personal data within the EU and to protect fundamental rights, in particular data protection. 133 These different ambitions of data protection are often not mutually exclusive and are sometimes in tension. 134 The GDPR’s fundamental rights objective has become dominant in its interpretation in recent years. 135 However, parsing this fundamental rights objective further, we can see that the content of the right to data protection itself remains contested. The right has been characterised in different ways: as promoting individual control over personal data; ensuring ‘fair’ processing of personal data; a right which simply guarantees legislative safeguards for data processing; and as instrumental for other rights. 136 Moreover, the Court has explicitly acknowledged that not all violations of the GDPR entail a fundamental rights interference, 137 thereby confirming that there are provisions of the law that do not have a fundamental rights character.

Whether the law is successful in achieving the protection of fundamental rights, in particular data protection, may differ depending on which of these conceptualisations of data protection one prefers. However, for simplicity, assuming that the GDPR gives at least partial expression to the right to data protection, 138 we might then infer that compliance with the GDPR would itself achieve the law’s objective of fundamental rights protection. This vision of effectiveness equates legal compliance with success. This assumes that the legal rules are the ‘right’ ones to achieve the objectives of data protection laws. In other words, by achieving high levels of compliance we would achieve the law’s objectives of fundamental rights protection. However, existing legal scholarship appears to challenge this assumption. Bygrave, for instance, observes a paradox in the enactment of ‘increasingly elaborate legal structures’ for privacy while privacy protection is increasingly eroded. 139 Richards similarly queries why people are so concerned about the Death of Privacy when there is so much privacy law. 140 There is also some limited empirical evidence to suggest that modern data protection frameworks encourage ‘symbolic compliance’ by allowing the information industry to apply the law in a way that aligns to corporate rather than public objectives. 141 While this empirical work was conducted in the USA, its findings are also said to reflect on the GDPR. Further empirical research is required to assess how the law is being received on the ground. Early evidence suggests that, far from even encouraging symbolic compliance, there remains widespread non-compliance with the law in reality. Writing in 2022, Lancieri examined 26 independent empirical studies assessing the impact of the GDPR and the California Consumer Privacy Act on legal compliance and concluded that non-compliance remains widespread. 142 Such non-compliance includes obvious violations, for instance, that 85% of Europe’s most accessed websites continued to track users even after they had opted out of such tracking. 143 Thus, while compliance requirements will undoubtedly play an important role in securing the application of the GDPR, 144 this suggests that over-reliance on controller compliance at the expense of enforcement would be erroneous. 145 Yet, even where the desire to comply is present, the law’s complete scope makes compliance with its provisions impossible in some circumstances (B) while rendering the enforcement needed to complement compliance strategies more challenging for regulators (C). In this way, complete protection is pitted against effective protection.

B. The Practical Impossibility of Compliance

It follows from the Court’s jurisprudence that the broad scope of responsibility it envisages renders compliance with the law practically impossible in some circumstances, one of Fuller’s characteristics of a bad law. 146 The practical impossibility of compliance is best illustrated through the Court’s caselaw on joint controllership, discussed above. It follows from this case law that in networked situations, for instance, where a student society uses Facebook to host a fan page, data controller responsibility is segmented. The student society would need to comply with data protection law for any element of the processing that it facilitates while Facebook would need to comply for any data processing operations it undertakes jointly with or independently of the student society. Some provisions of the GDPR apply awkwardly to this situation. For example, the requirement found in Article 26 GDPR which stipulates that joint controllers should arrange between them their respective responsibilities either functions as a legal fiction when applied between big technology platforms and natural persons or is widely disregarded. Both scenarios detract from the law’s credibility and legitimacy. However, joint controllership also leads to situations where it will be impossible in practice for the student society to comply with all of its obligations under data protection law. The Court has, for example, held that joint controllership is not contingent on the controllers having access to the data being processed. 147 Without such access the student society cannot comply with requests from individuals in relation to that data (such as data access, rectification or deletion requests). This necessarily raises the question of whether an individual or entity ought to be designated a data controller if they do not have or have not had access to the data that renders them legally responsible. In principle, as a joint controller the student society or individual could require others to provide such access pursuant to Articles 26 and 28 GDPR. Indeed, companies such as Meta have put in place a contractual addendum indicating that Meta will retain responsibility for compliance with data subjects’ rights that necessitate data access. 148 This fills the legal lacuna in this instance but it is noteworthy that this renders the compliance of the student society with the GDPR contingent on Meta’s contractual wishes. More broadly, this approach to controllership assumes that cooperation is feasible given the number of entities deemed joint data controllers pursuant to this approach and the often asymmetrical power relations between them. The same can be said for legal requirements that require no data for compliance, such as the GDPR’s transparency requirements. 149 Mahieu and Von Hoboken provide the example of the following transparency notice to illustrate this point evocatively:

We collect your IP address and browser-ID and transfer this personal data to Facebook. We do not know what Facebook does with the data. Click here to proceed.

By segmenting responsibility to ensure complete data protection, the Court renders key provisions of data protection law meaningless. The Court had been warned of this consequence by one of its Advocates General, who considered that, when it came to controllership, a conceptual lack of clarity upstream about who was responsible for what processing might cross ‘into the realm of actual impossibility for a potential joint controller to comply with valid legislation’. 150 This warning did not influence the Court.

The Opinions of the Advocates General in these cases on joint controllership give some insights into the Court’s thinking in developing responsibility in this way. The ambition, it seems, was a policy one: that by making more individuals and entities responsible for data protection compliance this would introduce some bottom-up pressure on more significant data controllers to take compliance seriously. This approach has been subsequently vindicated to some extent as it has given data protection regulators more leverage to apply the law to address systemic data protection concerns. For instance, civil society organisation NOYB submitted 101 complaints to various European data protection authorities arguing that website operators that used Google Analytics and Facebook Business Tools transferred data illegally from the EU to the USA. In its initial advisory assessment of this practice, the European Data Protection Board (EDPB) emphasised that each website operator must ‘carefully examine whether the respective tool can be used in compliance with data protection requirements’. 151 Moreover, given the difficulties experienced in the use of the GDPR’s pan-European enforcement mechanism (the one-stop-shop), 152 this approach also potentially returns competence to national data protection authorities if the data processing operations of the joint controller affect residents in that State only. 153

Therefore, while this approach is not without merit, what is overlooked in the equation is that the business models in question co-opt individuals and entities into data processing but without giving them any real stake or meaningful control in the data processing operations. The real locus of power over data processing lies not with the millions of joint controllers who embed such analytics tools in their content and services but with the operators who provide them. One might also wonder how the data subject stands to benefit from the designation of an entity that cannot comply with core data protection rights, such as access and erasure, as a data controller. Joint controllership as conceived by the Court in Jehovan , extending responsibilities to those who coordinate and orchestrate data processing operations, appears to more accurately capture the real site of power in digital ecosystems and therefore offers a more effective leverage point for regulatory intervention. Indeed, relying on the Jehovan logic, the Belgian regulator has analysed the data processing operations of almost the entire online advertising technology ecosystem by focussing on a critical apex entity, the Interactive Advertising Bureau (IAB). 154 We might be more willing to accept the practical impossibility of compliance with the law’s provisions if it delivers real gains for fundamental rights protection.

C. Data Protection Authorities as the Regulators of Everything

Securing effective data protection in Europe will require an appropriate blend of private enforcement (including by civil society actors), 155 compliance by regulated data controllers and public enforcement by regulators. The regulator alone is not responsible for the full application of the law. However, it could be argued that regulators continue to play an out-sized role in the success or failure of the EU data protection regime as the extent to which follow-on private enforcement is initiated or regulatees voluntarily comply with the law is dependent on their actions. It is therefore significant that the law’s broad scope of personal application also poses challenges for the regulators tasked with interpreting and enforcing its provisions.

At a very basic level, the volume of cases that regulators deal with has increased significantly since the entry into force of the GDPR, suggesting a ‘new level of mobilisation on the part of individuals’ to tackle data misuses. 156 For instance, while the Irish regulator received 910 complaints in 2013, between May and December 2018, following the entry into force of the GDPR, it saw this number triple. 157 Regulators report on the number of complaints that they receive annually in their Annual Reports and these figures have been collated on occasion at European level. 158 While this mobilisation is to be welcomed, regulators may lack the capacity to handle the increase in demand for their services. In response to a questionnaire of the EDPB, 82% of regulators explicitly stated that they do not have enough resources to conduct their activities. 159 In this sense, with finite budgets and human resources at their disposal, the broad scope of the law means regulators struggle to fulfil their legal supervisory obligations. The solution may lie, in part, with providing regulators with more resources.

Yet, while a lack of resources no doubt exacerbates the enforcement challenge for regulators, the problem may also be one of delimiting appropriate regulatory boundaries when data protection law is applied to everyone. It is not simply the number of regulatees that might complicate the work of regulators but also that the regulated community is extremely diverse. We might contrast this with other areas of regulation, such as energy regulation where the regulator deals primarily with energy firms, or even competition law, where the regulator deals only with ‘undertakings’ engaged in economic activity. 160 Data protection regulators must regulate, amongst others, the activities of individuals, charities, political parties, public authorities and commercial actors. This diversity of regulatees is significant as regulation—and regulators—benefit from the existence of a ‘cohesive interpretive community’. As Black emphasises, for rules to work, that is to apply in a way that would further the overall aims of the regulatory system, the person applying the rule has to ‘share the rule maker’s interpretation of the rule; they have to belong to the same interpretive community’. 161

A lack of cohesion amongst regulatees may make a common understanding of the law more difficult to attain, resulting in over- or under-compliance. Tales of such compliance misadventures are plentiful in data protection law. In 2019, for example, the Irish regulator found it necessary publicly to reassure the Irish General Post Office that maintaining public bins outside its premises would not violate the GDPR. 162 The more diverse the regulated community, the less the regulator will be able to assume some minimum level of understanding of the rules and the more demanding its task becomes. Moreover, it is apparent that, as a result of the diversity of regulatees under the law, some legal requirements are awkwardly applied to individuals. Not only are many of the law’s requirements predicated on centralised control over a file, 163 but they also assume that a data controller will have certain organisational and bureaucratic capacities at its disposal. The GDPR introduced a wide range of ex ante meta-regulation obligations that apply to controllers, such as the record keeping needed to comply with demonstrable accountability requirements 164 and the requirement to appoint a DPO in some circumstances. 165 As Nolan observes, implicit in these responsibilities is the assumption that controllers are ‘commercial, institutional or bureaucratic entities, if controllers are to ever be able to meaningfully comply with their obligations’. 166 While some of these requirements contain exceptions for small- and medium-sized enterprises (and implicitly individuals), this is not universally true. 167 In short, by undermining common understandings of the law and stretching the application of its requirements across all regulatees, the lack of cohesion in the regulated community detracts from the effectiveness of the law.

The diversity of the regulated community also puts pressure on regulators because they deal with a huge variety of regulatory issues. Recent examples include the systemic issues arising in data-centric industries, such as the ongoing legal investigations into the AdTech industry across Europe 168 ; assessing the compliance of public data processing initiatives, such as the use of contact tracing applications at the peak of the Covid-19 pandemic 169 ; complaints by individuals about institutional data controllers 170 ; and interpersonal complaints, including about the use of technologies such as smart doorbells and home security devices. 171 The diversity of contexts in which the law applies and actors within its regulatory ambit renders it impossible for regulators to provide general and authoritative guidance that is appropriate to all. Consider, for instance, the meaning of open-ended principles, such as fairness, found in the GDPR. 172 This concept could encompass both procedural and substantive fairness 173 and has been interpreted in differing ways by national regulators to date. 174 We might interpret fair processing differently if it is our neighbour processing our data compared to an international company such as Meta. Moreover, the capacity required to interpret open-ended principles such as fairness appropriately scales down badly, with individuals and small enterprises less likely to have the knowledge and resources at their disposal to do this.

In conclusion, while it is not possible to conclude authoritatively that the pursuit of complete data protection has rendered data protection ineffective, it is apparent that this completeness is in tension with effectiveness in two key ways. First, it has rendered compliance with the law’s requirements practically impossible in some circumstances. As we shall see in the next section, the Court’s response to such practical impossibility has been to develop an ad hoc rationalisation of the law—the responsibilities doctrine, a response which itself jeopardises the law’s effectiveness. Second, the law’s broad scope has further diversified the regulated community, making it more difficult for regulatees to have a shared understanding of the law and for regulators to exercise effective oversight of the broad array of data processing operations they must supervise. We will now consider how this problem might be addressed.

4. Introducing ‘Site-Level’ Flexibility

Can the law be both complete and effective, as the Court aspires? The literature on the effectiveness of regulatory instruments is surprisingly sparse. Not all problems with the GDPR’s enforcement stem from its broad scope. As Lancieri highlights, information asymmetries between regulators and data controllers undermine compliance and enforcement, as do high levels of market power in data-related markets. 175 Some problems in Europe also stem from the difficult cooperation between regulators foreseen by the GDPR. 176 However, the problems with the law’s effectiveness also stem, at least in part, from the over-inclusiveness of the law at rule level (in particular, as a result of the expanded scope of responsibility under the law). Bardach and Kagan suggest that such over-inclusiveness at rule level might be mitigated by a flexible application of the law at ‘site-level’. 177 Black similarly observes the reflexive relationship between rules and enforcement: it may be possible to use over-inclusive rules knowing that their application might be tempered through a conversational model of regulation. 178

It is possible to envisage mechanisms to facilitate such site-level accommodation in data protection law in two broad ways. 179 Such flexibility could come, firstly, through the interpretation of the law (A). Alternatively, or additionally, the law could be applied and enforced flexibly through graduated enforcement, applying insights from responsive regulation (B). These approaches are already evident to some extent in data protection law and practice. Yet, it is argued, without appropriate legislative underpinning and transparency regarding their application, they too risk jeopardising the attainment of the law’s objectives (C).

A. Flexible Interpretation: the Ad Hoc Rationalisation of the Law

The undesirable effects of an over-inclusive legal framework might be mitigated by interpreting the law in a ‘sensible’ or proportionate manner. Moreover, calls for such a ‘common sense’ approach to the interpretation of data protection law have been made from inside the Court. In Rīgas satiksme the Court was asked to consider whether data protection law provided legal grounds to compel the police to provide the personal information of an offender to a third party so that the third party could initiate civil proceedings against the offender. 180 Specifically, the referring Court asked the CJEU to consider whether the legitimate interests legal basis—which enables data processing where necessary for the legitimate interests of the controller or of third parties provided such interests do not override the fundamental rights of the data subject—could be interpreted in this way. While the Court suggested this question should be answered in the affirmative, the Advocate General was more sceptical, expressing a ‘certain intellectual unease as to the reasonable use and function of data protection rules’. 181 In the domestic proceedings leading to the case, the police—the data controllers—had refused the request on the basis, amongst others, that alternative options to access this information were available, leading to litigation and a referral to the national regulator. For the Advocate General, the application of data protection law in this context deviated from what he saw as the main concern of the law: namely, large-scale processing of personal data by mechanical, digital means. 182 He cautioned against the application of data protection rules in this context, suggesting that such ‘“application absolutism” might result in discrediting the original idea’. 183 Instead, he suggested that when balancing interests under the law, a rule of reason ought to be deployed, necessitating a distinction between situations entailing large-scale mechanical processing and those where a ‘lighter touch’ is required. 184 While this has been interpreted as a call to introduce more flexibility and less formalism into the application of proportionality assessments under the data protection framework, 185 it could also be seen as a broader appeal for more flexibility in the law’s application outside the structures of proportionality assessments. It is noteworthy that the Advocate General refers to a rule of reason, rather than proportionality as such.

The challenges of introducing a dose of ‘common sense’, or site-level flexibility, to the law’s application are best illustrated by the Court’s designation of Google Search as a data controller and the subsequent jurisprudential contortions it has engaged in to ensure that Google’s Search operations can comply with the law. In Google Spain the Court concluded that Google Search was a data controller and was therefore responsible for ensuring its search engine activities were compliant with data protection law. In his Opinion, the Advocate General encouraged the Court to take into consideration proportionality, the objectives of the law and the means the law contains to achieve those objectives to reach a ‘balanced and reasonable outcome’. 186 His concern was that a search engine operator could not comply in law or in fact with the law’s provisions leading to the ‘absurd’ conclusion that a search engine could not be compatible with the law. 187 This concern had also been expressed by academic observers. 188 The Court was confronted with these concerns in the later case of GC and Others , which laid bare the mismatch between the operations of a search engine and the law’s requirements. At stake in GC was the prohibition on the processing of ‘special category’ personal data found in Article 9(1) GDPR. This provision reads as follows:

Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation shall be prohibited.

This provision is clearly worded as a prohibition, which is then subject to a number of exceptions found in Article 9(2) GDPR, none of which readily apply to Google’s search engine activities. A literal interpretation of the law would therefore put Google’s search engine operations in direct conflict with the prohibition on sensitive data processing and render them illegal. As the rules on sensitive data processing are clearly linked to the fundamental rights of individuals, the inescapable conclusion would be that Google should cease or significantly alter its search engine operations.

In GC, the Court was asked to consider whether this prohibition applied to Google Search. The national referring court prefaced this question by asking whether the general prohibition also applies to search engines, ‘having regard to the specific responsibilities, powers and capabilities of the operator of the search engine’. 189 The inspiration for this qualification to controller duties came from the Court in Google Spain when it stated that a search engine operator must ensure that its activity complies with the law’s requirements ‘within the framework of its responsibilities, powers and capabilities’. 190 The meaning of this phrase, and in particular its ramifications for the responsibilities of controllers under data protection law, was left unexplored until GC and Others.

In GC, the Court invoked this responsibilities formula to devastating effect. It began by emphasising that the prohibition applies to all kinds of processing by all controllers 191 and that an a priori exclusion of search engines from the prohibition would run counter to its ambition of enhanced protection for such rights-infringing processing. 192 Nevertheless, the Court went on to highlight the ‘specific features’ of a search engine which would have an effect on the extent of its responsibility under the law. 193 In particular, as the search engine operator is responsible as a data controller by linking to existing publications, the Court held that the prohibition ‘can apply to that operator only by reason of that referencing and thus via a verification, under the supervision of the competent national authorities, on the basis of a request by the data subject’. 194 The end result of GC is that the Court, relying on the responsibilities formula, maintained the fiction that the law applied to Google Search in full, while interpreting a provision of the law that is clearly worded as a prohibition as if it were a right. This ad hoc rationalisation of the law to accommodate Google’s business model not only goes against a literal interpretation of the provision but also contradicts the law’s general scheme. 195 The consequences of this approach will be elucidated below.

B. Flexible Enforcement: the Role of Regulatory Discretion

An alternative option to interpreting the law in a flexible manner would be to introduce flexibility at the point at which decisions regarding the enforcement of the law are made. Two distinct options present themselves. Regulators might first exercise judgment in deciding which actions or complaints they will pursue. They might subsequently display further flexibility in determining how they deal with these cases.

The extent to which regulators can exercise this first-level flexibility in complaint handling under the GDPR is unclear. In other fields, the idea of risk-based regulation has taken root. This is a strategy which allows regulators to ‘prioritize how they consume their limited enforcement resources such that threats that pose the greatest risks to the regulator’s achievement of its institutional objectives are given the highest priority, while those that pose the least risk are allocated with few (if any) of the regulator’s limited resources’. 196 European data protection regulators are already prioritising their resources in this way. The Irish regulator, for instance, states that it applies a ‘risk-based regulatory approach to its work, so that its resources are always prioritised on the basis of delivering the greatest benefit to the maximum number of people’. However, while risk might be used to prioritise regulatory resources, it cannot be used as a criterion to exclude the handling of complaints entirely. The law requires regulators to ‘handle complaints … and investigate, to the extent appropriate, the subject matter of the complaint and inform the complainant of the progress and outcome of the investigation’. 197 Authorities have seemingly sought to stem the flow of complaints coming their way by indirectly imposing on individuals ‘preliminary actions or evidence requirements that do not directly derive from the GDPR’, calling into question their legality. 198 Yet, an authority cannot simply ignore a complaint or decline to deal with it as it is not a regulatory priority. 199 This is supported by the fact that data subjects have an explicit right to an effective judicial remedy against a regulator where the regulator ‘does not handle a complaint or does not inform the data subject within 3 months on the progress or outcome of the complaint’. 200 Nevertheless, authorities must only handle complaints ‘to the extent appropriate’. This suggests that they may inject discretion into the process at the second level of flexibility.

Flexibility in terms of the response of regulators to an infringement is in keeping with the idea of responsive regulation. Ayres and Braithwaite’s influential work queried when regulators should punish and when they should persuade. Their enforcement pyramid proposed that regulators begin at the base with persuasion, moving up the pyramid to warnings and then penalties if the regulatory engagement did not have the desired effect. 201 Is such a tit-for-tat approach permitted under the GDPR? According to the Court in Schrems II, the primary responsibility of regulators is to monitor the application of the GDPR 202 and to ensure that it is ‘fully enforced with all due diligence’. 203 Data protection regulators, which are endowed by the Charter with ‘complete independence’ in the discharge of their duties, might argue that such complete independence enables them to tailor the approach they take in order to ensure the ‘full’ enforcement of the law. This might entail starting at the bottom of the enforcement pyramid by relying on persuasion before escalating to credible sanctions at the top where required. Some national laws, such as the Irish Data Protection Act of 2018, 204 expressly foresee the possibility of the amicable resolution of disputes.
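
The escalation logic of the enforcement pyramid can be pictured as a simple, stepwise procedure. The short Python sketch below is offered only as an illustration of Ayres and Braithwaite’s model: the tier labels and the compliance check are assumptions made for this example, not a description of any authority’s actual procedure. The regulator starts with persuasion and moves up a tier only where the previous measure fails to produce compliance, holding the credible sanction in reserve at the apex.

    ENFORCEMENT_PYRAMID = [
        "persuasion and advice",          # base of the pyramid
        "formal warning or reprimand",
        "enforcement notice",
        "administrative fine",            # apex: credible sanction held in reserve
    ]

    def escalate(complies_after) -> list[str]:
        """Walk up the pyramid until compliance is achieved or measures run out.

        `complies_after` is a callable taking the measure applied and returning
        True if the controller then complies (a stand-in for the regulator's
        assessment at each step of the tit-for-tat exchange).
        """
        measures_taken = []
        for measure in ENFORCEMENT_PYRAMID:
            measures_taken.append(measure)
            if complies_after(measure):
                break  # de-escalate once the desired effect is achieved
        return measures_taken

    # Example: a controller that only responds once a formal warning is issued.
    print(escalate(lambda measure: measure == "formal warning or reprimand"))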

However, other aspects of the law appear to place a greater constraint on regulatory discretion. The provisions on administrative sanctions suggest that they were not envisaged as part of an enforcement pyramid. The GDPR text provides that regulators shall ensure that the imposition of administrative fines is effective, proportionate and dissuasive in each individual case, 205 while the non-binding recitals state that penalties including administrative fines ‘should be imposed for any infringement … in addition to, or instead of appropriate measures imposed by the supervisory authority’. 206 By way of exception, it specifies that for minor infringements, or if the fine would constitute a disproportionate burden to a natural person, a reprimand may be issued instead of a fine. Erdos, for instance, claims that the GDPR therefore establishes a presumption that a national data protection authority will ‘at least take formal corrective action once cognisant of a significant infringement of data protection law’. 207 This seems also to be borne out by the wider text of the GDPR. The idea of amicable dispute resolution is mentioned only once, in a recital, and only then in the context of disputes that are localised because of their nature or impact. 208 We could conclude that, at a minimum, amicable resolution is inappropriate in the context of transnational disputes which might require cooperation between various concerned authorities. It is notable also that while data subjects have the right to challenge a regulator before a Court where it does not handle a complaint or where it issues a legally binding decision, 209 this seems to leave a gap in situations where the complaint is handled but no legally binding decision is adopted. 210 Again, this suggests that the legislature did not foresee such flexible enforcement of the rules at scale. Beyond the doctrinal question of whether data protection law allows for the exercise of such site-level discretion, this discretion also raises broader normative challenges to which we shall now turn.

C. The Challenges of Site-Level Flexibility

In an ideal world, the ‘unreasonable and excessive legal consequences’ 211 of the broad scope of application of data protection law might be avoided or mitigated by interpreting and enforcing the law flexibly while continuing to offer effective and complete protection to individuals. The reality, however, is that site-level flexibility itself entails potential negative repercussions that must be addressed. Two negative consequences stand out: these concern the effectiveness and the quality of the law, respectively.

(i) The effectiveness of the law

The impact that the flexible interpretation and enforcement of data protection law will have on the law’s effectiveness remains uncertain. In GC the Court was left with a choice: to declare Google Search’s data processing, and therefore its business model, to be incompatible with the law or to accommodate the business model. The Court’s solution—treating an ex ante prohibition as an ex post right—does the latter: it is a bespoke interpretation of the law designed to accommodate a business model that does not fit the mould. It has been suggested that this finding provides a ‘safety valve’ against the disproportionate extension of data protection obligations to search engine operators. 212 Such accommodation might be justified on the basis of the societally beneficial role search engines play in organising the world’s information. 213 It was likely for this reason that the Advocate General considered that any finding of incompatibility with the law by search engines would be absurd. Yet, the relationship between law and technology in this instance is worth highlighting. The law is often simplistically characterised as seeking to keep up with technology; in GC, however, we see that technological design shapes the interpretation and application of the law. 214 Specifically, the responsibilities formula deployed by the Court to rationalise the law’s application means that technologies that are designed in a way that renders data protection compliance impossible may avoid the law. It is thus no longer safe to assume that when there is personal data processing, ‘the entire body of the data protection guarantees applies’. 215 The Court’s approach is likely to embolden proponents of the ‘move fast and break things’ model of technological practices and design. We might, for instance, query whether data protection rights such as the right to delete can be exercised on an immutable decentralised ledger technology such as blockchain 216 or whether a tool like ChatGPT could avoid ex ante or ex post data protection requirements because they are not commensurate with the ‘responsibilities, powers and capabilities’ of the relevant data controllers. In short, the risk is that the responsibilities formula creates an incentive for technologists to circumvent the law through design, a scenario that almost certainly militates against effective data protection. 217
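
The difficulty with erasure on an append-only ledger, mentioned above, can be shown with a minimal hash-chained toy ledger. The Python sketch below is a simplified stand-in for blockchain-style structures generally, not a description of any particular system: each block commits to the previous block’s hash, so rewriting or removing a record containing personal data causes verification of the chain to fail, which is precisely the design property that sits uneasily with a deletion request.

    import hashlib
    import json

    def block_hash(record: dict, prev_hash: str) -> str:
        payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def build_chain(records: list) -> list:
        chain, prev = [], "0" * 64
        for record in records:
            digest = block_hash(record, prev)
            chain.append({"record": record, "prev": prev, "hash": digest})
            prev = digest
        return chain

    def verify(chain: list) -> bool:
        prev = "0" * 64
        for block in chain:
            if block["prev"] != prev or block_hash(block["record"], prev) != block["hash"]:
                return False
            prev = block["hash"]
        return True

    ledger = build_chain([{"user": "alice", "data": "address"},
                          {"user": "bob", "data": "purchase"}])
    print(verify(ledger))              # True: the intact ledger verifies
    ledger[0]["record"]["data"] = ""   # attempt to 'erase' the personal data
    print(verify(ledger))              # False: the tampered chain no longer verifies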

Nor is it clear that the flexible enforcement of the law will yield more effective data protection. While it is generally acknowledged that the success of data protection law should not be measured using a crude assessment such as the number of fines issued, 218 this is in part because the law offers a broader array of corrective powers that regulators can draw on, such as a ban on data processing operations, that may have an equal, if not greater, effect than fines. 219 Evidence to date indicates that European data protection regulators have made limited use of the full palette of corrective powers. 220 If flexible enforcement, anchored in the enforcement pyramid, secured the more effective application of data protection law, a purposive interpretation of the law would support its application. However, we lack the empirical evidence needed to assess whether flexible enforcement leads to more effective protection. In situations where the overall level of formal enforcement drops dramatically due to a regulatory preference for informal interactions between regulators and regulatees, doubts arise as to the impact of the law in practice. For instance, in the UK, although the regulator ‘handled’ 40,000 data subject complaints in the 2021–2022 period, only four fines were issued for breach of the GDPR, totalling £663,000. 221 No other enforcement notices or penalties were issued. Some of the examples of situations where the regulator opted not to use its formal enforcement powers are striking. For instance, the Information Commissioner’s Office (ICO) did not impose an administrative sanction on two police forces that surreptitiously recorded and stored over 200,000 phone conversations involving victims, witnesses and perpetrators of suspected crimes, as part of its revised approach towards the public sector. 222 We might legitimately query in these circumstances whether informal enforcement is delivering effective fundamental rights protection.

(ii) The quality of the law

The flexible interpretation and application of the law is difficult to square with some of the core qualities of law that ensure its internal morality, including that law be general, publicly promulgated and that there be congruence between official action and declared rule. 223 This is particularly important in the data protection context where the foreseeability of the law is a requirement to justify interferences with fundamental rights 224 while the foreseeability of data processing operations is central to garnering public trust in processing and technology. 225

The data protection framework is ‘all or nothing’ in so far as it applies when the data processed is personal but not to non-personal data. 226 However, it has arguably never been accurate to characterise the data protection framework as a one-size-fits-all model, or an ‘intensive and non-scalable regime of rights and obligations’, 227 due to the existence of the general principle of proportionality and the introduction of risk-management obligations. These already introduce a significant degree of flexibility into how the law is interpreted. For instance, Gellert observes that while the GDPR provides some guidance to data controllers regarding potential sources of risk (toxicological factors), it largely leaves the consequences and harms (epidemiological factors), as well as the methodologies for assessing harms, undelineated. 228 However, the use of the responsibilities formula marks a qualitative shift in the law’s flexibility. 229 While some may welcome a doctrine that enables the application of the law to be calibrated to the powers of the data controller, 230 this must be set against the uncertainty that this formula introduces about how the rules apply and to whom. Unlike other elements of the legal regime which also introduce elements of scalability, such as the provisions introducing risk-management requirements, the application of this formula comes with no guidance or legislative footing. Quelle suggests that this gap could be filled by applying the responsibilities formula with reference to risk. 231 While this may help to anchor the application of the responsibilities formula more firmly to the text of the GDPR in some circumstances, it would not be helpful when interpreting provisions where there is no reference made to risk. The result will be further unpredictability in the regime’s application, to the detriment not only of its effectiveness but also of its transparency and foreseeability.

Moreover, while the ‘rule of reason’ applied by the Court might be likened to the principle of proportionality, proportionality analysis does not feature explicitly at all in the Court’s reasoning. Like the application of the rule of reason in competition law, where a restriction on competition was removed from the scope of competition law as this restriction was inherent in the pursuit of public policy objectives, this might be characterised as ‘bold and innovative or unprincipled and misconceived’ 232 depending on one’s perspective. More generally, the extent of the role that proportionality could play in introducing flexibility to the law’s application remains ambiguous. If the data protection framework is correctly characterised as a justificatory framework for data processing that interferes with fundamental rights, then the provisions of the GDPR and their interpretation should embody the principle of proportionality. Primarily through the jurisprudence of the Court, proportionality has emerged as a ‘data privacy principle in its own right’ with some viewing it as being ‘at the core of the GDPR’s structure’. 233 While the data protection principles do not explicitly include proportionality, it is said to underpin them and ‘shines through in their interstices’. 234 Proportionality therefore potentially offers a more rigorous tool through which to introduce flexibility into the data protection framework. This, however, depends on how the proportionality principle is applied. The Court has, for instance, on occasion replaced an assessment of whether data processing was compatible with the specific provisions of the GDPR with a more general assessment of whether the processing was compatible with the principle of proportionality, grounding its reasoning directly in the EU Charter rights to data protection and to respect for private life. 235 Regulators are more likely than courts to apply the law’s specific provisions faithfully, rather than to replace their application with a broader proportionality analysis as the Court did in this case. Moreover, while some provisions of the law lend themselves readily to proportionality analysis, 236 notably the principles found in Article 5 GDPR, many of the law’s other ex ante requirements, such as transparency obligations and the abovementioned prohibition on special category data processing, are less amenable to proportionate interpretation. The appropriate role of this principle in calibrating the application of data protection law, and its relationship with the risk requirements introduced by the GDPR, requires further research and consideration.

The compatibility of responsive regulatory enforcement with rule of law requirements has received surprisingly little attention. 237 The complete independence of data protection authorities dictates that these regulators exercise their powers free from internal and external influence. However, some accountability mechanisms must exist if regulators fail to discharge their primary responsibility of enforcing the law. 238 The status quo also does nothing to prevent zealous application of the law, such as fining individuals for the positioning of their home or business surveillance cameras or for posting footage of public disorder incidents on social media. 239 The transparency of the criteria applied in deploying the enforcement pyramid will be critical in this regard. 240 For instance, the ICO has adopted a revised approach towards the public sector, where it has opted to use its discretion to reduce the impact of fines on public sector operators. Pursuant to this approach, the ICO will rely on powers to warn, reprimand and issue enforcement notices, with fines only handed down in the ‘most serious cases’. 241 However, the example mentioned above of the covert recording of conversations by the police where no fine was issued raises the question of what the ICO considers to be a ‘serious case’. More broadly, empirical evidence suggests that where regulators have adopted a strategic approach to enforcement this has neither been calibrated to the extent to which the data controllers demonstrated compliance with relevant legal requirements nor systematically assessed against the overarching requirement to achieve effective and complete protection of data subjects. 242

In the absence of clear and transparent criteria guiding the enforcement of the law, the ensuing regulatory roulette offends against the equal protection and application of the law to the detriment of its beneficiaries—individuals in the first instance but ultimately society. Moreover, it may be inappropriate to apply the ‘conversational approach’ to the enforcement of the law, found at the bottom of the enforcement pyramid, in some circumstances. These include situations where the stakes are high (such as where there is a risk of irreversible harm); where there are no repeated interactions with regulatees; or where the regulatee is reluctant to comply. 243

5. Conclusion

Data protection law faces mounting criticism, both from human rights scholars and activists and from those who treat it as an unnecessary impediment to boundless data processing and the claimed innovation this would entail. Despite the technological developments during its lifespan, it has proven to be a resilient and adaptable legal framework, most recently acting as a first brake on the deployment of generative AI in ways that violate fundamental rights. The expansive interpretation of responsibility under the law has already yielded some benefits. Equally, however, many of the challenges that the law faces stem from its application, not to everything, but to everyone. While we could think of data protection as a broad church, it has also been characterised (perhaps more accurately) as an indiscriminate obsession. 244 Thinking about the law’s future, we could be pulled in different directions. On the one hand, it is challenging to interpret the law in a way that is appropriate to different contexts while, on the other, its broad application puts regulators under pressure, with rising numbers of complaints which they have an imperative to handle. The judicial response has been to overlook these problems, or simply to patch them by rationalising the law’s application in an ad hoc manner.

Turning to the future, the possibility of using increased site-level flexibility must be further explored and the rule of law challenges it entails addressed. This can be done by the EDPB without legislative change under the auspices of the GDPR. More broadly, however, it is clear that the current lack of empirical assessment of how the law applies in practice ‘leaves legal reformers shooting in the dark, without a real understanding of the ways in which previous regulatory attempts have either promoted or thwarted privacy’s protection’. 245 Recognising that no law is ever fully enforced, what is required for data protection is agreement on an appropriate standard against which to gauge regulatory effectiveness. Determining an appropriate balance between data protection compliance and data protection enforcement will be necessary. Finally, and perhaps most ambitiously, the purposes of data protection law need to be further specified by the Court. A starting point may be to disentangle the intersecting demands of informational privacy from those of fair information governance. 246

This may seem like an uphill battle. The data protection pioneer Spiros Simitis spoke of data protection as an ‘impossible task’. 247 However, Simitis also saw data protection as an ‘unending learning process’ necessitating a ‘continuous critical review of the regulatory approach’ to ensure its efficiency. 248 It is in this spirit that the challenge of securing effective fundamental rights protection in the digital era should be approached.

Associate Professor, LSE Law School and Visiting Professor, College of Europe Bruges, Belgium. E-mail: [email protected]. I am very grateful to the Editors of Current Legal Problems for the invitation to contribute to this series, with particular thanks to Despoina Mantzari for her guidance throughout. My thanks also to the anonymous referees for their valuable comments and to Mr Wojciech Wiewiórowski, the European Data Protection Supervisor, for his generosity in attending and chairing the lecture. I benefited from helpful feedback on earlier drafts of this text from Gloria Gonzalez Fuster, Hielke Hijmans, Filippo Lancieri, Rotem Medzini, Katherine Nolan and Thomas Streinz. All views, and any errors, remain my own.

Bilyana Petkova, ‘Privacy as Europe’s First Amendment’ (2019) 25 European Law Journal 140.

For instance, in Google Spain the Court held that ‘as a general rule’ the data subject’s rights to data protection and to respect for private life override the interests of internet users in access to information (Case C-131/12, Google Spain SL and Google Inc. v Agencia Española de Protección de Datos (AEPD) and Mario Costeja González EU:C:2014:317, para 81).

Gloria Gonzalez Fuster and Hielke Hijmans, ‘The EU Rights to Privacy and Personal Data Protection: 20 Years in 10 Questions’, VUB Discussion Paper (2019) https://cris.vub.be/ws/portalfiles/portal/45839230/20190513.Working_Paper_Gonza_lez_Fuster_Hijmans_3_.pdf .

Anu Bradford, The Brussels Effect: How the European Union Rules the World (OUP 2020) 132; Anu Bradford, ‘The Brussels Effect’ (2012) 107 Nw U L Rev 1, 22–26. The Council of Europe’s Convention 108 is also a highly influential instrument and a likely standard for global convergence; Global Privacy Assembly, ‘Privacy and Data Protection as Fundamental Rights – A Narrative’ https://globalprivacyassembly.org/wp-content/uploads/2022/03/PSWG3-Privacy-and-data-protection-as-fundamental-rights-A-narrative-ENGLISH.pdf , 48–50.

Regulation (EU) 2022/1925 of the European Parliament and of the Council of 14 September 2022 on contestable and fair markets in the digital sector and amending Directives (EU) 2019/1937 and (EU) 2020/1828 (Digital Markets Act) (Text with EEA relevance) OJ [2022] L265/1.

Regulation (EU) 2022/2065 of the European Parliament and of the Council of 19 October 2022 on a Single Market For Digital Services and amending Directive 2000/31/EC (Digital Services Act) (Text with EEA relevance) OJ [2022] L277/1.

Ibid, Article 2(4)(g) and recital 10. This also follows from recital 12 and Article 8(1) Digital Markets Act (n 5).

Peter Hustinx, ‘The Role of Data Protection Authorities’ in Serge Gutwirth et al. (eds), Reinventing Data Protection (Springer 2009) 131, 133.

From within the Court see, for instance, Case C-245/20, X, Z v Autoriteit Persoonsgegevens ECLI:EU:C:2021:822, Opinion of AG Bobek, paras 55–56. Nadezhda Purtova, ‘The Law of Everything. Broad Concept of Personal Data and Future of EU Data Protection Law’ (2018) Law, Innovation and Technology 40; Bert-Jaap Koops, ‘The Trouble with European Data Protection Law’ (2014) 4 International Data Privacy Law 250.

Colin J. Bennett and Robin M. Bayley, ‘Privacy Protection in the Era of “Big Data”: Regulatory Challenges and Social Assessments’ in Bart van der Sloot, Dennis Broeders and Erik Schrijvers (eds), Exploring the Boundaries of Big Data (Amsterdam University Press 2016) 205, 210.

Raphaël Gellert, The Risk-Based Approach to Data Protection (OUP 2020), 186.

Luca Tosoni, ‘Article 4(6): Filing System’ in Christopher Kuner, Lee A Bygrave, Christopher Docksey and Laura Drechsler (eds), The EU General Data Protection Regulation (GDPR): A Commentary (OUP 2020) 138, 141.

The expansive approach to the territorial application of the GDPR is justified on the same grounds, but consideration of the jurisdictional reach of the rules is beyond the scope of this article. On jurisdictional issues see, Merlin Gömann, ‘The New Territorial Scope of EU Data Protection Law: Deconstructing a Revolutionary Achievement’ (2017) Common Market Law Review 567.

Before the enactment of the GDPR Erdos remarked that its ‘almost unfathomable scope, inflexible nature and sometimes unduly onerous default standards’ are ill suited to digital realities, recommending a more radical shift of focus and balance in the law. David Erdos, European Data Protection Regulation, Journalism, and Traditional Publishers: Balancing on a Tightrope? (OUP 2019) 146.

Colin Bennett, Regulating Privacy: Data Protection and Public Policy in Europe and the United States (Cornell University Press 1992).

Articles 5 and 6 GDPR.

Articles 12–22 GDPR.

Claudia Quelle, ‘Enhancing Compliance under the General Data Protection Regulation: The Risky Upshot of the Accountability- and Risk-based Approach’ (2018) 9 European Journal of Risk Regulation 502; Reuben Binns, ‘Data Protection Impact Assessments: A Meta-regulatory Approach’ (2017) 7 International Data Privacy Law 22.

On the phenomenon of legislative instruments giving expression to fundamental rights in equality and data protection law see Elise Muir, EU Equality Law: The First Fundamental Rights Principle of the EU (OUP 2018) 137–143.

Joined Cases C-293/12 and 594/12, Digital Rights Ireland Ltd and Seitlinger and others EU:C:2014:238, para 36. See also C-311/18, Data Protection Commissioner v Facebook Ireland Limited and Maximillian Schrems EU:C:2020:559, para 170; Opinion 1/15, ECLI:EU:C:2016:656, para 123.

The Court has conceptualised the application of the right to data protection in this way, however, the content and application of the right remain contested. See, González Fuster and Hijmans (n 3).

While it remains possible to envisage daily activities that do not entail personal data processing, such as riding a bicycle or reading a book, a digital component is now introduced to many of our activities (such as the digital transactions required to rent a bike in a city or the use of an e-reader to read books).

This term was coined by Purtova in her influential article ‘The Law of Everything’ (n 9).

Damian George, Kento Reutimann and Aurelia Tamò-Larrieux, ‘GDPR Bypass by Design? Transient Processing of Data Under the GDPR’ (2019) 9 International Data Privacy Law 285.

Article 4(6) GDPR defines a ‘filing system’ as ‘any structured set of personal data which are accessible according to specific criteria, whether centralised, decentralised or dispersed on a functional or geographical basis’.

The GDPR recognises a category of pseudonymous data but this is still categorised as personal data (Article 4(5) GDPR).

Recital 26 GDPR. Article 4(3)(b) GDPR defines pseudonymisation.

See, for instance, the examples recognised in the Court’s jurisprudence referred to by Wachter and Mittelstadt in Sandra Wachter and Brett Mittelstadt, ‘A Right to Reasonable Inferences: Re-thinking Data Protection Law in the Age of Big Data and AI’ (2019) Columbia Business Law Review 1, 30–31.

Article 4(1) GDPR.

Paul Ohm, ‘Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization’ (2009) 57 UCLA Law Review 1701; Michèle Finck and Frank Pallas, ‘They Who Must Not Be Identified—Distinguishing Personal from Non-personal Data Under the GDPR’ (2020) 10 International Data Privacy Law 11; Nadezdha Purtova, ‘From Knowing by Name to Targeting: The Meaning of Identification Under the GDPR’ (2022) 12 International Data Privacy Law 163.

See, Taylor and Francis, ‘What Are the Different Types of Peer Review?’ https://authorservices.taylorandfrancis.com/publishing-your-research/peer-review/types-peer-review/ ; or OUP, ‘Five Models of Peer Review: A Guide’ (23 September 2021) https://blog.oup.com/2021/09/five-models-of-peer-review-a-guide/ .

Anonymity in this context serves the purpose of limiting the risk of bias in the evaluation procedure (as distinct from under the GDPR where it serves to determine the law’s scope of application).

Case C-434/16, Nowak v Data Protection Commissioner EU:C:2017:994, para 34. One might argue that the article itself is simply data—a source of information—that needs to be read to reveal information about the individual, however, the Court has not, as of yet, made this distinction between data and information.

Durant v Financial Services Authority [2003] EWCA Civ 1746.

Nowak (n 33) para 35.

ibid, paras 37–39.

ibid, para 43.

ibid, para 34.

Case C-582/14, Patrick Breyer v Bundesrepublik Deutschland ECLI:EU:C:2016:779 para 43 confirms that it is not necessary that the information enabling identification be in the hands of one entity. However, for such data to constitute identifiable information it must also be assessed whether the data combination is a means reasonably likely to be used to identify an individual (para 45).

See, for instance, Articles 17(4) and 85 GDPR. The balancing of data protection and related rights and freedom of expression must therefore occur within the data protection framework.

Case C-245/20, X, Z v Autoriteit Persoonsgegevens, Opinion of AG Bobek (n 9) paras 56 and 57.

Purtova, ‘The Law of Everything’ (n 9) 57.

Bart van der Sloot, ‘Truth from the Sewage: Are We Flushing Privacy Down the Drain?’ (2021) 12 European Journal of Law and Technology https://ejlt.org/index.php/ejlt/article/view/766

Salóme Viljoen, ‘A Relational Theory of Data Governance’ (2021) 131 Yale L J 573.

Michal S. Gal and Orla Lynskey, ‘Synthetic Data: Legal Implications of the Data-Generation Revolution’ (forthcoming) Iowa Law Review 2023; LSE Legal Studies Working Paper No. 6/2023.

Lorenzo Dalla Corte, ‘Scoping Personal Data: Towards a Nuanced Interpretation of the Material Scope of EU Data Protection Law’ (2019) 10 European Journal of Law and Technology https://ejlt.org/index.php/ejlt/article/view/672 .

Article 2(2)(b), (a) and (d) GDPR, respectively.

Article 2(3) GDPR.

Article 2(2)(c) GDPR.

Case C-101/01 Bodil Lindqvist [2003] ECR I-12971, para 39.

ibid, para 47.

Case C-212/13 František Ryneš v Úřad pro ochranu osobních údajů EU:C:2014:2428.

Article 29 Data Protection Working Party, ‘Opinion 1/2010 on the concepts of “controller” and “processor”’ WP169, adopted on 16 February 2010, 25. This Opinion was superseded by European Data Protection Board (EDPB) Guidelines. EDPB, ‘Guidelines 07/2020 on the concepts of controller and processor in the GDPR’ version 1.0, adopted on 2 September 2020, 8.

Article 4(7) GDPR.

Article 4(8) GDPR.

For instance, the processor is under a general obligation to ensure that appropriate technical and organisational measures are in place to ensure the processing complies with the Regulation and that any sub-processors it engages comply with the terms of the original contract with the controller (Article 28 GDPR).

Opinion 1/2010 (n 54) 2.

René Mahieu, Joris van Hoboken and Hadi Asghari, ‘Responsibility for Data Protection in a Networked World on the Question of the Controller, “Effective and Complete Protection” and its Application to Data Access Rights in Europe’ (2019) 10 Journal of Intellectual Property , Information Technology and Electronic Commerce Law 85, 87.

Heleen Janssen, Jennifer Cobbe, Chris Norval and Jatinder Singh, ‘Decentralized Data Processing: Personal Data Stores and the GDPR’ (2020) 10 International Data Privacy Law 356.

Brendan Van Alsenoy, ‘Allocating Responsibility among Controllers, Processors, and “Everything in Between”: The Definition of Actors and Roles in Directive 95/46/EC’ (2012) Computer Law & Security Review 25, 27.

EDPB Guidelines (n 54) 3 and 19.

Article 4(7) GDPR. Data controllers and data processors also benefit from procedural rights, such as the right to lodge a complaint with a supervisory authority: Article 77(1) GDPR.

The GDPR does recognise the specific needs of micro, small- and medium-sized enterprises to some extent in several recitals (recitals 13, 98, 137 and 167). It provides that their specific needs should be taken into account when codes of conduct are drawn up to contribute to the Regulation’s proper application and when certification measures are introduced, although neither codes of conduct nor certification have been widely adopted so far (Articles 40 and 42 GDPR).

The jurisdictional component of this case was also notable. The Court had held that although Google Inc., the parent company responsible for the coordination of Google’s data processing operations was established in the USA, the presence of a subsidiary in Spain selling advertising to cross-subsidise these operations was sufficient to bring the processing within the scope of EU data protection law. Google Spain (n 2) para 55.

Article 29 Working Party, ‘Opinion 1/2008 on data protection issues related to search engines’, adopted on 4 April 2008 WP148, 14.

Google Spain (n 2) para 22.

ibid, para 33.

ibid, paras 34 and 38.

Mahieu and von Hoboken note that the Court is more concerned with ensuring effective and complete protection ‘than a more literal interpretation of the law’s text would seem to point to’. René Mahieu and Joris von Hoboken, ‘Fashion ID: Introducing a Phase-oriented Approach to Data Protection?’ ( European Law Blog , 30 September 2019).

Case C-210/16, Unabhängiges Landeszentrum für Datenschutz Schleswig-Holstein v Wirtschaftsakademie Schleswig-Holstein GmbH (Facebook fan pages) ECLI:EU:C:2018:388, para 39.

ibid, para 35.

ibid, para 36.

This was a reasonable assumption based on the way in which the Court set out its reasoning. For instance, Mahieu et al (n 59, 94) were critical of the Court’s decision in Facebook fan pages stating that ‘it seems unreasonable that if Facebook would not offer the so-called Insights function, the fan page administrator would no longer have responsibility for the data processing’.

Case C-40/17, Fashion ID GmbH & Co.KG v Verbraucherzentrale NRW eV ECLI:EU:C:2019:629, para 75.

ibid, para 76. The Court noted that it seemed ‘impossible’ that Fashion ID determines the purposes and means of these subsequent processing operations.

ibid, para 77.

ibid, para 78.

ibid, para 79.

ibid, para 80.

ibid, para 82. Facebook fan pages (n 72) para 38.

This was noted by the Advocate General in Fashion ID who considered that taken to extremes this makes anyone in a ‘personal data chain’ who makes data processing possible a controller. Case C-40/17, Fashion ID GmbH & Co. KG v Verbraucherzentrale NRW eV ECLI:EU:C:2019:629, Opinion of AG Bobek, para 74.

Case C-25/17, Jehovan todistajat EU:C:2018:551.

ibid, para 16.

ibid, para 23.

ibid, para 73.

The EDPB distinguishes between essential means of processing (which are closely linked to the purposes and include determining what and whose personal data are processed and for how long) and non-essential means, which concern more practical aspects of implementation (e.g. hardware choices). EDPB Guidelines (n 54) 14.

Jehovan (n 84) para 69.

ibid, para 67.

C-604/22, IAB Europe v Gegevensbeschermingsautoriteit (application pending).

Lee A. Bygrave and Luca Tosoni, ‘Article 4(7): Controller’ in Christopher Kuner, Lee A Bygrave, Christopher Docksey and Laura Drechsler (eds), The EU General Data Protection Regulation (GDPR): A Commentary (OUP 2020) 145, 152.

EDPB Guidelines (n 54) 13.

R v Department of Health; ex parte Source Informatics Ltd [2000] 1 All ER 786, para 799. In R v Department of Health the UK Court of Appeal held obiter dicta that the process of anonymising personal data did not qualify as a form of ‘processing’ under the 1998 DPA.

Durant (n 34).

Case C-28/08P, European Commission v The Bavarian Lager Co. Ltd ECLI:EU:C:2009:624, Opinion of AG Sharpston, paras 144–146.

ibid, para 146.

ibid, paras 137 and 139.

It is perhaps also notable that the Advocate General took a holistic approach to ‘processing’, viewing the processing operation as a composite whole: she looked at the overall process of retrieving a legally contested digital document as opposed to a series of smaller, distinct processing operations.

Instead, the Court simply endorsed the General Court’s finding that the ‘communication of data, by transmission, dissemination or otherwise making available, falls within the definition of processing’. Case C-28/08P European Commission v The Bavarian Lager Co. Ltd [2010] ECR I-06055, para 69; endorsing [105] in T-194/04 The Bavarian Lager Co. Ltd v Commission [2007] ECR II-04523.

Case C-141/12 YS v Minister voor Immigratie, Integratie en Asiel and Minister voor Immigratie, Integratie en Asiel v M and S ECLI:EU:C:2020:753, Opinion of AG Sharpston, para 63.

ibid, para 56.

ibid, paras 58 and 59.

ibid, paras 55.

Case C-141/12 YS v Minister voor Immigratie, Integratie en Asiel and Minister voor Immigratie, Integratie en Asiel v M and S ECLI:EU:C:2020:753, para 40.

ibid, para 41.

ibid, paras 42–46.

Orla Lynskey, ‘Criminal Justice Profiling and EU Data Protection Law: Precarious Protection from Predictive Policing’ (2019) 15 International Journal of Law in Context 162, 169. This finding was likely influenced by a desire to avoid undermining established principles of administrative law, like freedom of information, in Member States.

Lee A. Bygrave and Luca Tosoni, ‘Article 4(1): Personal Data’ in Christopher Kuner, Lee A Bygrave, Christopher Docksey and Laura Drechsler (eds), The EU General Data Protection Regulation (GDPR): A Commentary (OUP 2020) 103, 110.

Nowak (n 33) para 46.

ibid, para 49.

Bygrave and Tosoni, ‘Article 4(1)’ (n 109) 110.

Facebook fan pages (n 72) para 24(1).

This aligns with the findings of the Supreme Court of Milan which held that as long as the illicit data is unknown to the service provider it cannot be a data controller. Giovanni De Gregorio, Digital Constitutionalism in Europe: Reframing Rights and Powers in the Algorithmic Society (Cambridge Studies in European Law and Policy, CUP 2022), 138.

Case C-131/12, Google Spain SL and Google Inc. v Agencia Española de Protección de Datos (AEPD) and Mario Costeja González EU:C:2014:317, Opinion of AG Jääskinen, para 83

ibid, para 82.

Case C-40/17, Fashion ID , Opinion of AG Bobek (n 83), para 71.

ibid, para 72.

Spiros Simitis, ‘Legal and Political Context of the Protection of Personal Data and Privacy’ (Speech in Montreal, September 1997) Council of Europe Archives (T-PD (97) 17—on file with the author), 7.

Case C-245/20, X, Z v Autoriteit Persoonsgegevens, Opinion of AG Bobek (n 9) para 65.

Bygrave and Tosoni, ‘Article 4(1): Personal Data’ (n 109) 113. We will return to the distinction between compliance and enforcement below.

Katherine Nolan, The Individual in EU Data Protection Law (PhD thesis, LSE Law School), 130.

Koops (n 9) 251.

Irish Data Protection Commission, ‘Annual Report 2021’, 5.

Gráinne de Búrca, Reframing Human Rights in a Turbulent Era (OUP 2021), 46.

Adam Satariano, ‘Europe’s Privacy Law Hasn’t Shown Its Teeth, Frustrating Advocates’ New York Times (27 April 2020), https://www.nytimes.com/2020/04/27/technology/GDPR-privacy-law-europe ; Johnny Ryan and Alan Toner, ‘Europe’s Governments are Failing the GDPR’ (Brave Report 2020).

Impact Assessment, ‘Commission Staff Working Document accompanying SEC(2012) 72 final’, Brussels (2 January 2012), 103.

Access Now, ‘The right to lodge a data protection complaint: OK, but then what? An empirical study of current practices under the GDPR’, June 2022. More generally, it noted that ‘there is a lack of precise information on complaint-handling, including on the number of complaints lodged with DPAs’ (ibid, 4).

The number of complaints received could also be an indicator of the relevance and visibility of the law to individuals.

Google Transparency Report, ‘Requests to Delist Content Under European Privacy Law’, https://transparencyreport.google.com/eu-privacy/overview?hl=en-GB .

Julia Powles, ‘The Case That Won’t Be Forgotten’ (2015) 47 Loy U Chi LJ 583. See also, Julia Powles and Enrique Chaparro, ‘How Google Determined Our Right to Be Forgotten’, The Guardian (18 February 2015).

Julia Black, Rules and Regulators (OUP 1997), 9.

Article 1(2) and (3) GDPR. Chapter V GDPR subjects data flows to outside the EU to distinct legal requirements to ensure that the level of protection individuals receive when the data is transferred out of the EU is ‘essentially equivalent’ to within the EU to prevent the circumvention of the data protection framework.

Macenaite, for instance, considers the aims of developing a data-driven economy and protecting fundamental rights and freedoms to be essentially contradictory while Yakovleva envisages their reconciliation. Milda Macenaite, ‘The “Riskification” of European Data Protection Law through a Two-Fold Shift’ (2017) 8 European Journal of Risk Regulation 506, 507; Svetlana Yakovleva, ‘Personal Data Transfers in International Trade and EU Law: A Tale of Two Necessities’ (2020) 21 Journal of World Investment & Trade 881, 888.

Kristina Irion, ‘A Special Regard: The Court of Justice and the Fundamental Rights to Privacy and Data Protection’ in Ulrich Faber et al (eds), Gesellschaftliche Bewegungen - Recht unter Beobachtung und in Aktion: Festschrift für Wolfhard Kohte (Nomos 2016) 873. This was foreseen by Spiros Simitis, ‘From the Market to the Polis: The EU Directive on the Protection of Personal Data’ (1994–1995) 80 Iowa Law Review 445.

Plixavra Vogiatzoglou and Peggy Valcke, ‘Two Decades of Article 8 CFR: A Critical Exploration of the Fundamental Right to Personal Data Protection in EU Law’ in Eleni Kosta, Ronald Leenes and Irene Kamara (eds), Research Handbook on EU data protection (Edward Elgar 2022). See also, Gonzalez Fuster and Hijmans (n 3).

C-60/22, UZ v Bundesrepublik Deutschland ECLI:EU:C:2023:373, para 65.

The Court has not explicitly confirmed that the GDPR ‘gives expression’ to the right to data protection, which might result in a self-referential system whereby the right to data protection is interpreted in light of secondary law. Nadezhda Purtova, ‘Default Entitlements in Personal Data in the Proposed Regulation: Informational Self-determination Off the Table … and Back On Again?’ (2014) 30 Computer Law & Security Review 6, 11.

This echoes Koops’ earlier observation that ‘we see data protection bodies moving all around, but they do not provide us with real protection’. Koops (n 9) 259.

Neil Richards, Why Privacy Matters (OUP 2022) 52.

Ari Ezra Waldman, Industry Unbound: The Inside Story of Privacy, Data and Corporate Power (CUP 2021), 114. This echoes the findings of Black in the field of financial services regulation where she refers to ‘creative compliance’. Julia Black, ‘Learning from Failures: “New Governance” Techniques and the Financial Crisis’ (2012) 75 Modern Law Review 1037.

Filippo Lancieri, ‘Narrowing Data Protection’s Enforcement Gap’ (2022) 74 Maine Law Review 15, Appendix: 65–72.

Lancieri cites Sanchez-Rola et al. to this effect. See, Iskander Sanchez-Rola et al., ‘Can I Opt Out Yet? GDPR and the Global Illusion of Cookie Control’ (2019) Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security 1, 3–5.

Hodges advocates that effective data protection requires ‘a system of constructive engagement in resolving problems, involving relationships based on evidence of trust’ between regulators and businesses. Christopher Hodges, ‘Delivering Data Protection: Trust and Ethical Culture’ (2018) 1 European Data Protection Law Review 65, 79.

Hielke Hijmans, ‘How to Enforce the GDPR in a Strategic, Consistent and Ethical Manner? A Reaction to Christopher Hodges’ (2018) 1 European Data Protection Law Review 80, 82.

Fuller refers to ‘rules that require conduct beyond the powers of the affected party’. Lon L. Fuller, The Morality of Law (Revised edn, Yale University Press 1969), 39.

Facebook fan pages (n 72) para 38; Jehovan (n 84) para 69.

See https://www.facebook.com//legal/controller_addendum accessed 23 August 2023.

Articles 12–14 GDPR.

Case C-40/17, Fashion ID , Opinion of AG Bobek (n 83) para 84.

EDPB, ‘Report of the Work Undertaken by the Supervisory Authorities within the 101 Task Force’ (28 March 2023), 10.

There is emerging consensus that there are structural impediments to its effective enforcement. For instance, the European Data Protection Supervisor hosted a conference in May 2022 on data protection enforcement to make progress on this issue. European Data Protection Supervisor, Effective enforcement in the digital world, June 2022. https://www.edpsconference2022.eu/en .

This was the case in Facebook Fanpages (n 72).

Michael Veale, Midas Nouwens and Cristiana Teixeira Santos, ‘Impossible Asks: Can the Transparency and Consent Framework Ever Authorise Real-Time Bidding After the DPA Decision?’ (2022) Technology and Regulation 12.

On the enhancement of the role of civil society actors and public regulators in this space see Lancieri, ‘Data Protection’s Enforcement Gap’ (n 142) 57–60.

Irish Data Protection Commission, ‘Annual Report 2018’, 5.

EDPB, ‘Overview on Resources Made Available by Member States to the Data Protection Authorities and on Enforcement Actions by the Data Protection Authorities’ (5 August 2021), 10. However, in its study on complaints Access Now notes that what can be gleaned from such figures is limited due to disparities in what is treated as a complaint and the handling of complaints at national level. Access Now (n 128), 4.

ibid, 5. With the exception of Germany, which has over 1,000 employees, all other regulators had fewer than 300 employees in 2021 (ibid).

Niamh Dunne, ‘Knowing When to See It: State Activities, Economic Activities, and the Concept of Undertaking’ (2010) 16 Colum J Eur L 427.

J Black, Rules and Regulators (n 132) 30. This is in keeping with later work describing regulation as a ‘communicative process’. See, Julia Black, ‘Regulatory Conversations’, (2002) 29 Journal of Law and Society 163, 164.

Ian Begley, ‘Office of Data Protection Commissioner Says GPO Can Keep their Bins as Public Litter Is Not in Breach of GDPR rules’, Irish Independent (2 May 2019). https://www.independent.ie/irish-news/office-of-data-protection-commissioner-says-gpo-can-keep-their-bins-as-public-litter-is-not-in-breach-of-gdpr-rules-38073828.html .

Chris Reed, ‘The Law of Unintended Consequences – Embedded Business Models in IT Regulation’ (2007) Journal of Information, Law and Technology 33, 9 (noting the law’s ‘implicit assumption that there is central control of personal data processing’).

Article 5(2) and 30 GDPR.

Rotem Medzini, ‘Credibility in Enhanced Self-regulation: The Case of the European Data Protection Regime’ (2021) 13(3) Policy & Internet 366.

Nolan (n 122) 37.

Article 30(5) GDPR contains a derogation for SMEs from the requirement to maintain a record of processing activities; however, Article 25 on data protection by design and by default contains no such exception.

See, for instance, Autorité de Protection des Données, ‘The BE DPA to restore order to the online advertising industry: IAB Europe held responsible for a mechanism that infringes the GDPR’, Press Release (2 February 2022); Decision of the litigation chamber, Case number: DOS-2019-01377 (2 February 2022).

EDPB, ‘Guidelines 04/2020 on the Use of Location Data and Contact Tracing Tools in the Context of the COVID-19 Outbreak’ (21 April 2020); EDPB, ‘Guidelines 03/2020 on the Processing of Data Concerning Health for the Purpose of Scientific Research in the Context of the COVID-19 Outbreak’, 21 April 2020.

The French regulator (the CNIL) received 14,143 complaints in 2021 and responded with advice and information to a further 33,329 phone calls and 16,898 contacts by e-mail (representing a 39% increase on 2020). Commission Nationale de l’Informatique et des Libertés (CNIL), ‘The CNIL in a Nutshell 2022’, 4.

Dr Mary Fairhurst v Mr Jon Wakefield (Oxford County Court) (12 October 2021), Case No: G00MK161.

Article 5(1)(a) GDPR.

Damian Clifford and Jef Ausloos, ‘Data Protection and the Role of Fairness’ (2018) 37 Yearbook of European Law 130.

Reporting on the findings from national rapporteurs see, Orla Lynskey, ‘General Report Topic 2: The New EU Data Protection Regime’ in Jorrit Rijpma (ed), The New EU Data Protection Regime: Setting Global Standards for the Right to Personal Data Protection (Eleven International Publishing 2020) 23, 36.

Lancieri, ‘Data Protection’s Enforcement Gap’ (n 142) 28–55.

Giulia Gentile and Orla Lynskey, ‘Deficient-By-Design? The Transnational Enforcement of the GDPR’ (2022) 71 International and Comparative Law Quarterly 799.

Eugene Bardach and Robert A. Kagan, Going by the Book: The Problem of Regulatory Unreasonableness (2nd edn, Transaction Publishers 2003) 7.

Black, Rules and Regulators (n 132) 43–44.

In practical terms, the fact that a data subject retains the option to initiate private enforcement action against a data controller for breach of the GDPR would seemingly undermine any attempt by public enforcers to mitigate the hard edges of the law through site-level flexibility.

Case C-13/16, Rīgas satiksme ECLI:EU:C:2017:336.

Case C-13/16, Rīgas satiksme ECLI:EU:C:2017:336, Opinion of AG Bobek, para 93.

ibid, para 95.

ibid, para 96. As in any other area of law, rules governing certain activity must be sufficiently flexible in order to catch all the potential eventualities that arise. That might, however, lead to the danger of an overbroad interpretation and application of those rules. They might end up being applied also to a situation where the link with the original purpose is somewhat tenuous and questionable.

This lighter touch would be needed in situations ‘when a person is asking for an individual piece of information relating to a specific person in a concretised relationship, when there is a clear and entirely legitimate purpose resulting from the normal operation of the law’. ibid, para 98.

Lorenzo Dalla Corte, ‘On Proportionality in the Data Protection Jurisprudence of the CJEU’ (2022) 12 International Data Privacy Law 259, 265.

Google Spain , Opinion of AG Jääskinen (n 115), para 79. He deemed it inappropriate to apply a law that was drafted prior to the emergence of the decentralised internet teleologically (paras 77 and 78).

ibid, paras 89 and 90.

Miquel Peguera, ‘The Shaky Ground of the Right to Be Delisted’ (2016) 18 Vanderbilt Journal of Entertainment and Technology Law 507, 539.

C-137/17, GC and Others v Commission nationale de l’informatique et des libertés (CNIL) ECLI:EU:C:2019:773 para 31.

Google Spain (n 2) para 38; repeated at para 83.

GC and Others (n 189) paras 42 and 43.

ibid, para 44.

ibid, para 45.

Rights of individuals are clearly found in a chapter of the law labelled ‘Rights of the data subject’ while the Article 9 prohibition is found in the ‘Principles’ chapter.

Karen Yeung and Lee A. Bygrave, ‘Demystifying the Modernized European Data Protection Regime: Cross-disciplinary Insights from Legal and Regulatory Governance Scholarship’ (2022) 16 Regulation & Governance 137, 146.

Article 57(1)(f) GDPR.

Access Now (n 128) 41.

Hijmans, for instance, observes that ‘DPAs are free to set their own agenda, but with one limitation which is their obligation to handle complaints’. Hielke Hijmans, The European Union as Guardian of Internet Privacy (Springer 2016), 383.

Article 78(2) GDPR.

Ian Ayres and John Braithwaite, Responsive Regulation: Transcending the Deregulation Debate (OUP 2002) 35.

Case C-311/18, Data Protection Commissioner v Facebook Ireland Limited and Maximillian Schrems (Schrems II) ECLI:EU:C:2020:559, para 108.

ibid, para 112.

S.109(2) Data Protection Act 2018 (Ireland).

Article 83(1) GDPR.

Recital 148.

David Erdos, ‘Ensuring Legal Accountability of the UK Data Protection Authority: From Cause for Data Subject Complaint to a Model for Europe?’ (2020) 5 European Data Protection Law Review 444, 452.

Recital 131.

Article 78(2) and (1), respectively.

A lacuna explored, but not filled, in the UK case of Killock & Veale v ICO [2021] UKUT 299 (AAC).

Google Spain , Opinion of AG Jääskinen (n 115) para 30. He highlighted that currently ‘the broad definitions of personal data, processing of personal data and controller are likely to cover an unprecedentedly wide range of new factual situations due to technological developments’.

De Gregorio (n 114) 141.

For a more critical assessment of the power wielded by Google Search see Powles (n 117).

Therefore, while it is often claimed that the law is designed to be technologically neutral, we cannot claim that it applies in a technologically neutral way.

Purtova, ‘The Law of Everything’ (n 9) 71.

Michèle Finck, ‘Blockchains and Data Protection in the EU’ (2018) 1 European Data Protection Law Review 17, 30–31.

System design cannot only frustrate rights but often entails trade-offs between rights that are not made explicit by the law. See further, Michael Veale, Reuben Binns and Jef Ausloos, ‘When Data Protection by Design and Data Subject Rights Clash’ (2018) 8 International Data Privacy Law 105.

Commission Staff Working Document (n 127) 5.

European Parliament, ‘European Parliament resolution of 25 March 2021 on the Commission evaluation report on the implementation of the General Data Protection Regulation two years after its application (2020/2717(RSP))’ [2021] C494/29, para 13.

EDPB, ‘Overview on resources’ (n 134) 14.

ICO, Information Commissioner’s Annual Report and Financial Statements 2021–22 July 2022 HC 392, 33.

ICO, ‘ICO reprimands Surrey Police and Sussex Police for recording more than 200,000 phone calls without people’s knowledge’, 18 April 2023.

Fuller (n 146). These criteria also reflect those set out by Diver in his work on the optimal precision of legal rules. He notes that the success of a rule will depend on qualities such as its transparency (whether the words have a well defined and universally accepted meaning within the relevant community) and their accessibility (their application to concrete situations without excessive difficulty or effort). Colin S. Diver, ‘The Optimal Precision of Administrative Rules’ (1983) 93 Yale Law Journal 65.

Joris van Hoboken, ‘From Collection to Use in Privacy Regulation? A Forward-Looking Comparison of European and US Frameworks for Personal Data Processing’ in Bart van der Sloot, Dennis Broeders and Erik Schrijvers (eds), Exploring the Boundaries of Big Data (Amsterdam University Press 2016) 231, 248.

Lee A. Bygrave, Data Protection Law: Approaching Its Rationale, Logic, and Limits (Kluwer Law International 2002), 107–112.

Koops (n 9) 257.

Peter Blume, ‘The Data Subject’ (2015) 1 Eur Data Prot L Rev 42.

Gellert (n 11) 215.

The role of risk in data protection law remains ambiguous. As Yeung and Bygrave note, although regulatory scholars are familiar with the idea of ‘risk’ in various guises, the concept of ‘risk to rights’ is unfamiliar and the traditional focus of risk on quantifying tangible harms sits uneasily alongside the dignitarian basis for human rights. Yeung and Bygrave (n 170) 143.

Quelle, for instance, suggests that this formula serves the function of maintaining a broad scope of application for the data protection rules while ‘keeping the consequences of controllership in check’. Claudia Quelle, ‘GC and Others v CNIL on the Responsibility of Search Engine Operators for Referring to Sensitive Data: The End of “Right to be Forgotten” Balancing?’ (2019) 5 Eur Data Prot L Rev 438, 440.

Giorgio Monti, ‘Article 81 EC and Public Policy’ (2002) 39 Common Market Law Review 1057, 1088.

Lee A. Bygrave, Data Privacy Law: an International Perspective (OUP 2014) 147; De Gregorio (n 114) 141.

Lee A Bygrave and Dag Wiese Schartum, ‘Consent, Proportionality and Collective Power’ in Serge Gutwirth and others (eds), Reinventing Data Protection (Springer 2009), 162.

Case C-439/19, Latvijas Republikas Saeima (Points de pénalité) EU:C:2021:504, para 97.

ibid, para 98. In the penalty points case, the Court affirmed that the principle of data minimisation (Article 5(1)(c) GDPR) ‘gives expression to the principle of proportionality’.

Jan Freigang, ‘Is Responsive Regulation Compatible with the Rule of Law’ (2002) 8 European Public Law 463.

Erdos, ‘Ensuring Legal Accountability’ (n 207). The one-stop-shop and consistency mechanisms foreseen in Chapter VII, Sections 1 and 2 GDPR are ill equipped to force an authority to handle a complaint in a particular manner: Gentile and Lynskey (n 176).

Easy GDPR, ‘GDPR fine for Austrian kebab store’, https://easygdpr.eu/en/gdpr-incident/gdpr-fine-for-austrian-kebab-store/ ; One Trust Data Guidance, ‘Spain: AEPD fines individual €6,000 for unlawfully processing personal data’ https://www.dataguidance.com/news/spain-aepd-fines-individual-600-data-minimisation

The importance of transparency in this regard has been emphasised by the European Parliament, which has called for harmonisation of penalties by means of guidelines and clear criteria ‘in order to increase legal certainty and to prevent companies settling in the locations that impose the lowest penalties’. European Parliament (n 191).

ICO, ‘ICO Sets Out Revised Approach to Public Sector Enforcement’ (30 June 2022).

Erdos, Balancing on a Tightrope (n 14) 199.

Roger Brownsword and Morag Goodwin, Law and the Technologies of the Twenty-First Century: Text and Materials (Law in Context) (CUP 2012), 310.

Kenneth A. Bamberger and Deirdre K. Mulligan, Privacy on the Ground: Driving Corporate Behavior in the United States and Europe (MIT Press 2015), 9.

Brownsword and Goodwin (n 241) 312.

Simitis (n 119).


Privacy Protection and Secondary Use of Health Data: Strategies and Methods

Dingyi Xiang

1 Internet Rule of Law Institute, East China University of Political Science and Law, Shanghai, China

2 Humanities and Law School, Northeast Forest University, Harbin, Heilongjiang, China

3 Beidahuang Information Company, Harbin, Heilongjiang, China

Health big data has become among the most important forms of big data because of its serious privacy-disclosure concerns and the huge potential value of its secondary use. Measures must be taken to balance and reconcile these two serious challenges. A holistic solution or strategy is regarded as the preferred direction, by which the risk of reidentification from records is kept as low as possible and data are shared under the principle of minimum necessity. In this article, we present a comprehensive review of privacy protection for health data from four aspects: health data itself, related regulations, three strategies for data sharing, and three types of methods at progressive levels. Finally, we summarize this review and identify future research directions.

1. Introduction

The rapid development and application of health information technologies have enabled medical organizations to store, share, and analyze large amounts of personal medical, health, and biomedical data, the majority of which are electronic health records (EHR) and genomic data. Meanwhile, emerging technologies such as smartphones and wearable devices have enabled third-party firms to provide many kinds of complementary mHealth services and to collect vast quantities of consumer health data. Health big data has thus become among the most important forms of big data because of its serious privacy-disclosure concerns and the huge potential value of its secondary use.

Health big data has stimulated the development of personalized or precision medicine. Empowered by health informatics and analytic techniques, secondary use of health data can support clinical decision making; extract knowledge about diseases, genetics, and medicine; improve patients' healthcare experiences; reduce healthcare costs; and support public health policies [ 1 – 3 ]. On the other side of the coin, health data contains a great deal of personal and confidential information. To guide the protection of health-related privacy, the Health Insurance Portability and Accountability Act (HIPAA) of the US specifies 18 categories of protected health information (PHI) [ 4 ]. Heavy concerns about privacy disclosure considerably hinder the secondary use of health big data. Much effort has gone into balancing privacy management and secondary use of health data, from both the legislative side [ 5 ] and the technological side [ 6 , 7 ]. In most circumstances, however, a perfect balance is difficult to achieve; instead, a certain tradeoff or compromise must always be made. Recently, COVID-19 has perfectly illustrated the conundrum between protecting health information and ensuring its availability to meet the challenges posed by a significant global pandemic. In this ongoing battle, China and South Korea have mandated public use of contact tracing technologies with few privacy controls; other countries are also adopting contact tracing technologies [ 7 ].

The direct and most important strategy for balancing the two issues is to reuse health data under the premise of protecting privacy. The most basic idea is to share deidentified health data by removing the 18 specified categories of PHI. Based on deidentified health data, machine learning and data mining can be used for knowledge extraction or for building a learning health system in order to analyze and improve care, whereby treatment is tailored to the clinical or genetic features of the patient [ 8 ]. However, transforming data or anonymizing individuals may reduce the utility of the transferred data and lead to inaccurate knowledge [ 9 ]. This tradeoff between privacy and utility (or accuracy) is the central issue in the secondary use of sensitive data [ 10 ]. Deidentification refers to a collection of techniques devised to remove or transform identifiable information into nonidentifiable information, and also to introduce random noise into the dataset. Deidentification strengthens privacy protection, but the outcome of analysis may no longer be exact, only an approximation. To reconcile this conflict, the privacy loss parameter, also called the privacy budget, was proposed to tune the tradeoff between privacy and accuracy: changing the value of this parameter gives more or less privacy at the cost of less or more accuracy, respectively [ 11 ]. Furthermore, deidentified data may become reidentifiable through triangulation with other datasets, which means that the privacy harms of big health data arise not merely in the collection of data but in their eventual use [ 12 ]. Deidentification alone is therefore far from sufficient. Instead, a holistic solution is the right direction, by which the risk of reidentification from records is kept as low as possible and data are shared under the principle of minimum necessity [ 13 ]. For the minimum-necessary principle, user-controlled access [ 6 , 14 ] and secure network architectures [ 15 ] offer practical implementations. For effective reuse of health data while reducing the risk of reidentification, three lines of work provide applicable references: risk-mitigation methods, privacy-preserving data mining, and distributed data mining without sharing data.

The remainder of this paper is organized as follows. Section 2 describes the scope of health data and its categories. Section 3 summarizes regulations on the privacy protection of health data in several countries. Section 4 concisely reviews three strategies for privacy protection and secondary use of health data. Section 5 reviews tasks and methods for privacy preservation and data mining at three progressive levels. Section 6 concludes the study.

2. Health Data and Its Category

Generally speaking, any data associated with users' health conditions can be viewed as health data. The most important health data are clinical data, especially electronic medical records (EMR), produced by hospitals at different levels. With the development of health information technology and the popularization of wearable health devices, vast amounts of health-relevant data, such as monitored physiological data and diet or exercise data, are also collected from individuals and other entities, both passively and actively. According to the review article by Deven McGraw and Kenneth D. Mandl, health-relevant data can be classified into four categories [ 7 ]. In this research, we focus on the first two categories, which are directly related to users' health and privacy.

Category 1. Health data generated by the healthcare system. This type of data is clinical data, recorded by clinical professionals or medical equipment when a patient receives healthcare services in a hospital or clinic. Clinical data includes EMR, prescriptions, laboratory data, pathology images, radiography, and payer claims data. Patients' historical and current conditions are recorded for treatment purposes. To provide better health services for patients, it is important to track patients' lifelong clinical data and to share clinical data among different healthcare providers. The personal health record (PHR) was proposed to integrate a patient's cross-institutional, lifelong clinical data [ 16 ]. This type of health data is generated and collected routinely in the process of healthcare, with the explicit aim that it be used to analyze and improve care. Because it is collected for clinical treatment, and because of consumers' firm trust in healthcare experts and institutions, clinical data involves a high degree of health-related privacy. Accordingly, the majority of health privacy laws mainly cover the privacy protection of clinical data [ 7 ]. Under the constraints of health privacy laws, large volumes of clinical data have been restricted to internal use within medical institutions. At the same time, clinical data is extremely valuable for secondary use, since it is created by professional experts and directly describes consumers' health conditions. The tradeoff between the utility and privacy of this type of health data has become one of the most important issues in the age of medical big data.

Category 2. Health data generated by the consumer health and wellness industry. This type of health data is an important complement to clinical data. With the widespread application of new-generation information technologies, such as the IoT, mHealth, smartphones, and wearable devices, consumers' attitude to health has shifted from passive treatment to active health management. Consumer health data can be generated through wearable fitness-tracking devices, medical wearables such as insulin pumps and pacemakers, medical or health-monitoring apps, and online health services. These data can include breathing, heart rate, blood pressure, blood glucose, walking, weight, diet preference, position, and online health consultations. Such products, services, and data play an important role in consumers' daily health management, especially for patients with chronic diseases, and the area has gained increasing attention from industry and academia; consumer health informatics is the representative direction [ 17 ]. This type of nontraditional health-relevant data, often equally revealing of health status, is in widespread commercial use, yet in the hands of commercial companies it is often less accessible to providers, patients, and public health bodies for improving individual and population health [ 18 ]. These big health data are scattered across institutions and intentionally isolated to protect patient privacy. For this type of health data, integration and linkage at the individual level are an additional challenge on top of the utility-privacy tradeoff.

Table 1 summarizes the two categories of health data and their comparative features.

Table 1. Summary of clinical data and consumer health data.

Category 1: clinical data
  - Generated/recorded by: healthcare system; clinical professionals; medical equipment
  - Data detail: name, ID, age, address, phone, medical history, family history, conditions, laboratory tests, treatments, prescriptions, etc.
  - Data characteristics: discrete but more professional; more clinical information and more privacy; stored in the healthcare system; passively generated

Category 2: consumer health data
  - Generated/recorded by: wearable devices (wristband, watch); medical wearables; health apps
  - Data detail: name, ID, phone, address, position, age, weight, heart rate, breathing, blood pressure, blood glucose, exercise data, diet preference, online health consultation, etc.
  - Data characteristics: continuous but less standardized; more health information; privacy tends to be ignored; stored by different providers; actively generated; vast amounts

3. Regulations about Privacy Protection of Health Data

Recording personal information and health-relevant data is necessary in order to provide regular health services. At the same time, such data are closely associated with user privacy and confidential information. Several important privacy-related regulations or acts have therefore been published to guide the protection and reuse of health data. Modern data protection law is built on "fair information practice principles" (FIPPs) [ 19 ].

The most frequently referenced regulation is the Health Insurance Portability and Accountability Act (HIPAA) [ 4 ]. HIPAA was created primarily to modernize the flow of healthcare information, stipulate how personally identifiable information maintained by the healthcare and health insurance industries should be protected from fraud and theft, and address limitations on health insurance coverage. The HIPAA Safe Harbor (SH) rule specifies 18 categories of explicitly or potentially identifying attributes, called protected health information (PHI), that must be removed before health data is released to a third party. HIPAA also covers electronic PHI (ePHI), which includes medical scans and electronic health records. A full list of PHI elements is provided in Table 2. The PHI elements in Table 2 cover only identity information and do not include any sensitive attribute. That is, HIPAA does not provide guidelines on how to protect sensitive attribute data; rather, the basic idea of the HIPAA SH rule is to protect privacy by preventing identity disclosure. However, the remaining attributes may still combine into a quasi-identifier (QI) that allows data recipients to reidentify the individuals to whom the data refer. A strict implementation of the SH rule may therefore be inadequate for protecting privacy or for preserving data quality. Recognizing this limitation, HIPAA also provides alternative guidelines that enable a statistical assessment of privacy disclosure risk to determine whether the data are appropriate for release [ 20 ].

Table 2. Protected health information defined by HIPAA.

1. Names
2. Locations
3. Dates
4. Phone numbers
5. Fax numbers
6. E-mail addresses
7. Social security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and any comparable images
18. Any other unique identifying number, characteristic, or code

The Health Information Technology for Economic and Clinical Health (HITECH) Act [ 21 ] was enacted as part of the American Recovery and Reinvestment Act of 2009 to promote the adoption and meaningful use of health information technology. Subtitle D of the HITECH Act addresses the privacy and security concerns associated with the electronic transmission of health information, in part through several provisions that strengthen the civil and criminal enforcement of the HIPAA rules. It is complementary to HIPAA and strengthens HIPAA's privacy regulations. HITECH has also widened the scope of HIPAA through the Omnibus Rule, which extends the privacy and security reach of HIPAA/HITECH to business associates. Under HIPAA and the HITECH Act, much of the data beyond Category 1 in Table 1 falls outside the scope of comprehensive health privacy law in the US.

The Consumer Data Right (CDR) [ 22 ] is co-regulated by the Office of the Australian Information Commissioner (OAIC) and the Australian Competition and Consumer Commission (ACCC). The "My Health Record System" is run to track citizens' medical conditions, test results, and so on, and the OAIC sets out controls on how health information in a My Health Record can be collected, used, and disclosed, which corresponds to PHR integration. The Personal Information Protection and Electronic Documents Act (PIPEDA) [ 23 ] of Canada applies to all personal health data. PIPEDA is stringent and, although it has many commonalities with HIPAA, it goes beyond HIPAA requirements in several areas. One such area is the protection of data generated by mobile health apps, which is not strictly covered by HIPAA; PIPEDA thus extends to consumer health data. Under PIPEDA, organizations can seek implied or explicit consent, based on the sensitivity of the personal information collected and the reasonable data-processing consent expectations of the data subject. The General Data Protection Regulation (GDPR) is a wide-ranging data protection regulation in the EU, covering health data as well as all other personal data, including data containing sensitive attributes. The GDPR also sets expectations for data consent and breach notification and contains several key provisions, including notification, the right of access, the right to be forgotten, and portability. Under the GDPR, organizations are required to gain explicit consent from data subjects, and individuals have the right to restriction of processing and the right not to be subject to automated decision-making.

China has no specific regulation for the privacy protection of health data. Several rules prohibiting privacy disclosure are scattered across the China Civil Code (CCC), the Medical Practitioners Act of the PRC (MPAPRC), and the Regulations on Medical Records Management in Medical Institutions (RMRMMMI), which impose privacy-disclosure restrictions on individuals, medical practitioners, and medical institutions, respectively. The CCC specifies nine categories of personal information to be protected, including name, birthday, ID number, biometric information, living address, phone number, email address, health condition information, and position-tracking information. The RMRMMMI approves the reuse of health data only for medical care, teaching, and academic research. Recently, the Personal Information Protection Law of the PRC (PIPILRC) [ 24 ] was released and will come into force on November 1, 2021. It is the first complete and comprehensive regulation on personal information protection in China. In this regulation, the definitions of sensitive personal information and automated decision making both involve health data, so the regulation is applicable to the privacy protection of health data. Under it, secondary use of deidentified or anonymized health data for automated decision making is permitted, and data-processing consent from consumers is also required. This regulation, so far as can be foreseen, will greatly stimulate the exploitation and exploration of health big data.

A comparison of these data-privacy-relevant regulations, shown in Table 3, indicates that PIPEDA, the GDPR, and the newly released PIPILRC cover both clinical data and consumer health data, while the others devote most of their attention to clinical data. Health data needs to be reused for multiple important purposes, and in fact the processing and reuse of health data are never absolutely prohibited in the regulations mentioned above, as long as privacy protection is achieved as the essential prerequisite. In this respect, HIPAA sets out Safe Harbor rules to make sure PHI is removed before health data is released to a third party, while PIPEDA and the GDPR additionally require consumers' consent for data processing. The Chinese regulations also encourage health data to be reused in certain restricted areas. As the newcomer, the PIPILRC presents the most complete and comprehensive guidance on protecting and processing health data.

Table 3. Regulations and the categories of health data they cover.

  - HIPAA & HITECH (USA): Category 1 (clinical data)
  - CDR (Australia): Category 1 (clinical data)
  - PIPEDA (Canada): Categories 1 and 2 (clinical and consumer health data)
  - GDPR (EU): Categories 1 and 2 (clinical and consumer health data)
  - MPAPRC & RMRMMMI (China): Category 1 (clinical data)
  - CCC & PIPILRC (China): Categories 1 and 2 (clinical and consumer health data)

4. Strategies and Framework

The exploitation of health data can provide tremendous benefits for clinical research, but methods to protect patient privacy while using these data face many challenges. Some of these challenges arise from a misunderstanding that the problem should be solved by a single foolproof solution. There is a paradox: thoroughly deidentified and scrubbed data may lose much meaningful information, resulting in low quality, while retaining much PHI carries a high risk of privacy breach. Therefore, a holistic solution, or unified strategy, is needed. Three strategies are summarized in this section. The first addresses clinical data and provides a practical user-access rating system; the second mainly addresses genomic data and designs a network architecture to handle both secure access and the potential risk of privacy disclosure and reidentification; and, from a more practical starting point, the third shares a model without exposing any data. The three strategies offer solutions from different perspectives and can therefore complement each other.

4.1. Strategies for Clinical Data

For clinical data, Murphy et al. proposed an effective strategy for building a clinical data sharing platform while protecting patient privacy [ 6 ]. Their approach to resolving the balance between privacy management and secondary use of data is to match the level of data deidentification with the trustworthiness of the data recipients: the more identified the data, the more "trustworthy" the recipients are required to be, and vice versa. The level of trust in a data recipient thus becomes a critical factor in determining what data that person may see. This kind of hierarchical access rating is similar to film ratings, which accommodate the requirements and appetites of different types of audiences. Murphy et al.'s strategy sets up five patient privacy levels defined by three aspects: availability of the data, trust in the researcher and the research, and the security of the technical platforms. Corresponding to the privacy levels are five user role levels.

The lowest level of user is the "obfuscated data user." For this user, data are obfuscated as they are served to a client machine that may have low technical security; obfuscation methods add a random number to the aggregated counts instead of providing the exact result [ 25 , 26 ]. The second level is the "aggregated data user," to whom exact numbers from aggregate query results are permissible. The third is the "LDS data user," who is granted access to the HIPAA-defined limited dataset (LDS) and structured patient data from which PHI has been removed. The fourth is the "Notes-enabled LDS data user," who is additionally allowed to view PHI-scrubbed text notes (such as discharge summaries). The final level is the "PHI-viewable data user," who has access to all patient data.

These access level categories are summarized in Table 4.

Table 4. Health data access level categories.

Obfuscated data user
  - Data available: users have access to data through a client-side application only
  - Trustworthiness of user: low (only obfuscated aggregate results are available)
  - Technical security: low (only the client-side application is exposed to users)

Aggregated data user
  - Data available: users have access to HIPAA-deidentified data through a client-side application only
  - Trustworthiness of user: low (users can get exact patient counts against deidentified data)
  - Technical security: low (but the data manager assumes the burden of deidentifying data)

LDS data user
  - Data available: HIPAA-defined LDS and deidentified structured data
  - Trustworthiness of user: medium (users can see the LDS as defined by HIPAA)
  - Technical security: medium (requires user-facing direct access to the database)

Notes-enabled LDS data user
  - Data available: HIPAA-deidentified data and deidentified narrative text
  - Trustworthiness of user: medium (users see both the LDS and narrative text that is mostly deidentified)
  - Technical security: medium (requires user-facing direct access to the database)

PHI-viewable data user
  - Data available: all patient data may be accessed
  - Trustworthiness of user: high (users can see all protected health information on patients)
  - Technical security: high (requires management of encryption keys)

Guided by these health data access level categories, Murphy et al. implemented five cases in clinical research. In a realistic project, multiple user roles or different access privileges may be needed to reconcile different data access requirements, and Murphy et al. also provided three exemplar projects and their possible distributions of privacy-level users. The proposed strategy offers a complete reference for data-sensitive projects and implements a holistic approach to patient privacy in the Informatics for Integrating Biology and the Bedside (i2b2) research framework [ 27 ]. The i2b2 framework is the most widespread open-source framework for exploring clinical research data warehouses; it was jointly developed by Harvard Medical School and the Massachusetts Institute of Technology to enable clinical researchers to use existing deidentified clinical data, and only IRB-approved genomic data, for research. Yet i2b2 does not provide any specific protection mechanism for genomic data.
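To make the level-matching idea concrete, here is a minimal illustrative sketch (not the i2b2 implementation; the level names and noise range are assumptions) of how a query service might degrade the precision of a patient-count result according to the caller's assigned privacy level.

```python
import random

ACCESS_LEVELS = ("obfuscated", "aggregated", "lds", "notes_lds", "phi")

def answer_count_query(true_count: int, user_level: str) -> int:
    """Return a patient count whose precision matches the caller's trust level."""
    if user_level not in ACCESS_LEVELS:
        raise PermissionError("unknown access level")
    if user_level == "obfuscated":
        # Lowest-trust users only ever see a randomly perturbed aggregate count.
        return max(0, true_count + random.randint(-3, 3))
    # All higher-trust roles may see the exact aggregate count.
    return true_count

if __name__ == "__main__":
    print(answer_count_query(154, "obfuscated"))  # e.g. 152
    print(answer_count_query(154, "aggregated"))  # 154
```

In a real deployment the higher levels would additionally gate access to the limited dataset, scrubbed notes, and full PHI, as in Table 4; the sketch only shows the count-obfuscation step.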

4.2. Strategies for Genomic Data

For genomic data, two potential privacy threats are the loss of confidentiality of patients' health data due to illegitimate data access, and patients' reidentification, with resulting sensitive attribute disclosure, arising from legitimate data access. Building on the i2b2 framework, Raisaro et al. [ 15 ] proposed applying homomorphic encryption [ 28 ] to the first threat and differential privacy [ 29 ] to the second. Furthermore, Raisaro et al. designed a system model consisting of two physically separated networks, addressing the problem from an architectural perspective. The network architecture is shown in Figure 1. It aims to isolate the data used for clinical and medical care from the data used for research activities by a few trusted and authorized individuals.

Figure 1. Network architecture of privacy protection for health data, including genomic data.

The clinical network is used for the hospital's daily clinical activities and contains patients' clinical and genomic data. This network is tightly controlled and protected by a firewall that blocks all incoming network traffic; only authorized users are permitted to log in.

The research network hosts the i2b2 service used by researchers in their research activities. The i2b2 service is composed of an i2b2 server and a proxy server, on which a homomorphic encryption method and a differential privacy method are implemented and deployed. The i2b2 server can receive deidentified clinical data and encrypted genomic data from the clinical network and perform secure data query and computation. The proxy server supports the decryption phase and stores partial decryption keys for homomorphic encryption. Through the research network, researchers obtain authorized data via the query execution module in five sequential steps: query generation, query processing, result perturbation, partial result decryption, and final result decryption at the user-client side.

This network architecture and its privacy-preserving solution have been successfully deployed and tested at Lausanne University Hospital and used for exploring genomic cohorts in a real operational scenario. The deployment is a practicable demonstration for similar scenarios, and it is not a unique instance. Azencott likewise reviewed how breaches of patient privacy can occur and recent developments in computational data protection, and proposed a similar secure framework for genomic data sharing built around three aspects: algorithmic solutions to deidentification, database security, and trustworthy user access [ 3 ].

4.3. Strategies for Sharing Not Data but Models

Federated learning (FL), a new machine learning paradigm first introduced in 2016 [ 30 ], has developed rapidly and become a hot research topic in the field of artificial intelligence. Its core idea is to train machine learning models on separate datasets that are distributed across different devices or parties, which preserves local data privacy to a certain extent. This development has mainly benefited from three facts [ 31 ]: (1) the wide and successful application of machine learning technologies, (2) the explosive growth of big data, and (3) legal regulations for data privacy protection worldwide.

The idea of federated learning is to share only the model parameters instead of the original data. In this way, many initiatives are based on federated models in which the actual data never leave the institution of origin, allowing researchers to share models without necessarily sharing patient data. Federated learning has thus inspired another important strategy for developing smart healthcare from the sensitive and private medical records held by isolated medical centers and hospitals. As shown in Figure 2, federated learning offers a framework for jointly training a global model using datasets stored on separate clients.

Figure 2. Architecture of a federated learning system.

Model building of this kind has been used in real-world applications where user privacy is crucial, for example for hospital data or text prediction on mobile devices. Model updates are considered to contain less information than the original data, and through the aggregation of updates from multiple data points the original data is considered impossible to recover. Federated learning emphasizes the protection of the data owner's privacy during the model-training process, and effective measures to protect data privacy will help cope with the increasingly stringent data privacy and security regulatory environment of the future [ 32 ].
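As a minimal sketch of this parameter-sharing idea, the following toy federated-averaging loop trains a one-dimensional linear model across three simulated "hospitals"; the data, learning rate, and number of rounds are illustrative assumptions, not the configuration of any system cited above.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """One client's gradient-descent pass over its private (x, y) pairs for a
    1-D linear model y = w*x + b; only the updated parameters leave the site."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

def federated_average(client_datasets, rounds=20):
    """Server loop: broadcast the global model, collect client updates, and
    average them weighted by local dataset size (the FedAvg idea)."""
    global_w, global_b = 0.0, 0.0
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        updates = [local_update((global_w, global_b), d) for d in client_datasets]
        global_w = sum(w * len(d) for (w, _), d in zip(updates, client_datasets)) / total
        global_b = sum(b * len(d) for (_, b), d in zip(updates, client_datasets)) / total
    return global_w, global_b

if __name__ == "__main__":
    random.seed(0)
    # Three simulated "hospitals", each holding private samples of roughly y = 2x + 1.
    clients = [[(x, 2 * x + 1 + random.gauss(0, 0.05))
                for x in [random.uniform(0, 1) for _ in range(30)]]
               for _ in range(3)]
    print(federated_average(clients))  # parameters approach roughly (2.0, 1.0)
```

Only the `(w, b)` pairs are exchanged in each round; the raw `(x, y)` samples never leave their client, which is the privacy property the strategy relies on.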

5. Tasks and Methods

Under these strategies for health data protection, specific tasks and methods for privacy and data processing can be employed and deployed. The tasks and methods can be viewed at three progressive levels. Methods at the first level aim to mitigate the risk of privacy disclosure, from four aspects. Methods at the second level target data mining or knowledge extraction from deidentified or anonymized health data. Methods at the third level, which do not require health data to be shared at all, build a learning model or extract knowledge in a distributed manner and then share the model or knowledge.

5.1. Risk-Mitigation Methods

There are two widely recognized types of privacy disclosure [ 33 ]: identity disclosure (or reidentification) and attribute disclosure. The former occurs when illegitimate data users try to match a record in a dataset to an individual; the latter occurs when illegitimate data users try to predict the sensitive value(s) of an individual record. According to Malin et al. [ 34 ], methods for mitigating the risk of these two types of privacy disclosure can be divided into four classes: suppression, generalization, randomization, and synthetization. This categorization summarizes recent research on risk-mitigation methods well.

5.1.1. Suppression Methods

Suppression methods aim to scrub (remove or mask) the 18 categories of PHI defined in HIPAA, which is the most important deidentification approach. Before PHI can be scrubbed, the main task is to identify the PHI in the health data. For structured data, PHI identification can be done easily according to the data schema. For narrative data or free text, such as discharge summaries or progress notes, natural language processing (NLP) is the preferred technology for PHI identification. Specifically, named entity recognition (NER) is the mainstream technology used on clinical data for deidentification and medical knowledge extraction. The 18 PHI categories are treated as predefined entity types; machine learning is employed to annotate a type tag for each word in a sentence, the tags are merged, and finally the position and type of each PHI element are identified. Conditional random fields (CRFs) are the classic sequential tagging model for NER and are often applied to deidentification [ 35 ]. Meystre et al. conducted a systematic review of deidentification methods [ 36 ], and Uzuner et al. [ 37 ] and Deleger et al. [ 38 ] both carried out evaluations on human-annotated datasets. The identified PHI values are then simply removed from the released text documents or replaced with a constant value, which may be inadequate for protecting privacy or preserving data quality. Li and Qin proposed a new systematic approach that integrates methods developed in both the data privacy and health informatics fields. Its key novel elements include a recursive partitioning method to cluster medical text records and a value enumeration method to anonymize potentially identifying information in the text, which essentially masks the original values, improving both privacy protection and data utility [ 20 ].
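As a toy illustration of suppression (not one of the cited systems; the regular-expression patterns and placeholder tags below are simplified assumptions, whereas production deidentifiers rely on trained NER models), the following sketch masks a few PHI categories in free text.

```python
import re

# Illustrative patterns for a few HIPAA PHI categories (dates, phone numbers,
# e-mail addresses, medical record numbers). Real systems combine far more
# patterns with trained sequence-tagging models such as CRFs.
PHI_PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
}

def scrub(note: str) -> str:
    """Replace matched PHI spans with bracketed category tags."""
    for tag, pattern in PHI_PATTERNS.items():
        note = pattern.sub(f"[{tag}]", note)
    return note

if __name__ == "__main__":
    text = "Seen on 03/14/2021, MRN: 00123456, call 617-555-0101 or jdoe@example.com."
    print(scrub(text))
    # Seen on [DATE], [MRN], call [PHONE] or [EMAIL].
```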

For genomic data, homomorphic encryption [ 28 ] is applied to encrypt the data, and the encrypted data can then be shared for secondary use. Raisaro et al. employed homomorphic encryption to build a data warehouse for genomic data [ 15 ]. Kamm et al. [ 39 ] proposed a framework for generating aggregated statistics on genomic data using secure multiparty computation based on homomorphic secret sharing. Several other works [ 28 , 40 , 41 ] proposed using homomorphic encryption to protect genomic information so that researchers can perform some statistics directly on the encrypted data and decrypt only the final result.
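The cited frameworks rely on full homomorphic encryption schemes; as a loose, simplified illustration of the related secret-sharing idea used for aggregate statistics, the sketch below computes a total cohort count across sites with additive secret sharing, so that no site reveals its own count. The party setup and modulus are assumptions chosen purely for illustration, not any cited protocol.

```python
import secrets

PRIME = 2_147_483_647  # modulus for share arithmetic (illustrative choice)

def split_into_shares(value: int, n_parties: int) -> list[int]:
    """Split a local count into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]

def secure_sum(local_counts: list[int]) -> int:
    """Each site splits its count into shares; each party sums the shares it
    holds and publishes only that partial sum, from which the total is rebuilt."""
    n = len(local_counts)
    all_shares = [split_into_shares(c, n) for c in local_counts]
    partial_sums = [sum(site_shares[j] for site_shares in all_shares) % PRIME
                    for j in range(n)]
    return sum(partial_sums) % PRIME

if __name__ == "__main__":
    hospital_counts = [42, 17, 95]   # local cohort counts at three sites
    print(secure_sum(hospital_counts))  # 154, with no site disclosing its own count
```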

5.1.2. Generalization Methods

These methods transform data into more abstract representations. The simplest implementation is coarsening; for instance, the age of a patient may be generalized from 1-year to 5-year age groups. Building on this type of generalization, sensitive attributes can be grouped and anonymized to some extent, which is the idea behind k-anonymity and its variants. k-anonymity seeks to prevent reidentification by stripping enough information from the released data that any individual record becomes indistinguishable from at least (k − 1) other records [ 42 ]. The idea is to modify the values of the quasi-identifier (QI) attributes so that it is difficult for an attacker to unravel the identity of persons in a particular dataset while the released data remain as useful as possible. This modification is a kind of generalization, by which stored values are replaced with semantically consistent but less precise alternatives [ 43 ]. For example, consider a dataset in which age is a quasi-identifier. While the three records {age = 30, gender = male}, {age = 35, gender = male}, and {age = 31, gender = female} are all distinct, releasing them as {age = 3∗, gender = male}, {age = 3∗, gender = male}, and {age = 3∗, gender = female} ensures they all fall into the same age category, so the release is 3-anonymous with respect to age. Building on k-anonymity, l-diversity [ 44 , 45 ] was proposed to address further disclosure issues concerning sensitive attributes.
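A minimal sketch of generalization follows (illustrative only, not a full anonymization algorithm such as those in the cited work): ages are coarsened into bands and the achieved k, the size of the smallest group sharing the same quasi-identifier values, is then checked.

```python
from collections import Counter

def generalize_age(age: int, band: int = 10) -> str:
    """Map an exact age to a coarser band, e.g. 34 -> '30-39'."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def achieved_k(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return the size of the smallest group sharing the same QI values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

if __name__ == "__main__":
    raw = [{"age": 30, "gender": "male"},
           {"age": 35, "gender": "male"},
           {"age": 31, "gender": "female"}]
    released = [{**r, "age": generalize_age(r["age"])} for r in raw]
    # Treating age alone as the quasi-identifier, as in the example above:
    print(achieved_k(released, ["age"]))  # 3 -> the release is 3-anonymous w.r.t. age
```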

5.1.3. Randomization Methods

Randomization can be used for attribute-level data. In this case, original sensitive values are replaced, with a certain probability, by similar but different values. For example, a patient's name may be masked by a randomly selected made-up name. This basic approach can degrade data quality; Li and Qin proposed obtaining replacement values via a clustering method instead [ 20 ].

Randomization can also be applied to aggregation operations, and obfuscation is one such form of randomization. Numerous repetitions of a query by a single user must be detected and interrupted, because repeated queries will converge on the true patient count; proper user identification is therefore necessary for these methods to function properly [ 6 ]. To deidentify aggregated data, obfuscation methods add to the patient counts a random number whose distribution is defined by a Gaussian function; obfuscation is applied to aggregate patient counts reported as the result of ad hoc queries on the client machine [ 26 ]. Another protection model for preventing reidentification is differential privacy [ 10 , 46 ]. In this model, reidentification is prevented by adding noise to the data. The model is based on the fact that auxiliary information will always make it easier to identify an individual in a dataset, even an anonymized one. Differential privacy instead seeks to guarantee that the information released when querying a dataset is nearly the same whether or not a specific person is included [ 46 ]. Unlike other methods, differential privacy provides formal statistical privacy guarantees.
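As a minimal sketch of a differentially private count query (an illustration of the Laplace mechanism, not a production implementation), the snippet below adds noise scaled to the query's sensitivity divided by the privacy budget epsilon; a smaller epsilon gives stronger privacy and noisier answers.

```python
import random

def laplace_noise(scale: float) -> float:
    """Laplace(0, scale) noise, sampled as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count perturbed by Laplace noise calibrated to sensitivity/epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon)

if __name__ == "__main__":
    exact = 154  # e.g., the true number of matching patient records
    for eps in (0.1, 1.0, 10.0):
        print(eps, round(dp_count(exact, eps), 1))
    # A smaller epsilon (tighter privacy budget) yields noisier, less accurate counts.
```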

5.1.4. Synthetization Methods

Synthetization is compelling for two main reasons: it preserves confidentiality and supports valid inference for various estimates [ 47 ]. In this case, the original data are never shared. Instead, general aggregate statistics about the data are computed, and new synthetic records are generated from those statistics to create fake but realistic-looking data. Exploiting clinical data to build intelligent systems is one such scenario: developing clinical natural language processing systems often requires access to many clinical documents, which are not widely available to the public because of privacy and security concerns. To address this challenge, Li et al. proposed methods to generate synthetic clinical notes and evaluated their utility in real clinical natural language processing tasks. Thanks to advances in deep learning, recent progress in text generation has made it possible to generate synthetic clinical notes that can be useful for training NER models for information extraction from real clinical notes, thus lowering the privacy concern and increasing data availability [ 48 ].
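As a toy sketch of synthetization for structured records (far simpler than the deep-learning text generators discussed above; the column choices and the independence assumption are illustrative), the snippet below fits simple per-column statistics and samples fake but realistic-looking records from them.

```python
import random
import statistics

def fit_marginals(records):
    """Collect simple per-column statistics: mean/stdev for numeric columns,
    observed values for categorical ones. Columns are treated as independent."""
    stats = {}
    for col in records[0]:
        values = [r[col] for r in records]
        if all(isinstance(v, (int, float)) for v in values):
            stats[col] = ("numeric", statistics.mean(values), statistics.pstdev(values))
        else:
            stats[col] = ("categorical", values)
    return stats

def synthesize(stats, n):
    """Sample n synthetic records from the fitted per-column statistics."""
    out = []
    for _ in range(n):
        row = {}
        for col, spec in stats.items():
            if spec[0] == "numeric":
                row[col] = round(random.gauss(spec[1], spec[2]), 1)
            else:
                row[col] = random.choice(spec[1])
        out.append(row)
    return out

if __name__ == "__main__":
    original = [{"age": 34, "diagnosis": "diabetes"},
                {"age": 58, "diagnosis": "hypertension"},
                {"age": 47, "diagnosis": "diabetes"}]
    print(synthesize(fit_marginals(original), 2))  # fake records, no real patient shared
```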

5.2. Privacy-Preserving Data Mining

Data mining is also called knowledge discovery from data (KDD), a name that highlights the goal of the mining process. To obtain useful knowledge from data, the mining process can be divided into four iterative steps: data preprocessing, data transformation, data mining, and pattern evaluation and presentation. Based on this division of the KDD process, Xu et al. developed a user-role-based methodology and identified four types of users in a typical data mining scenario: data provider, data collector, data miner, and decision maker. By differentiating these roles, privacy-preserving data mining (PPDM) can be explored in a principled way, in which all users care about the security of sensitive information but each role views the security issue from its own perspective [ 49 ]. In this research, PPDM is explored from the perspective of the data miner, that is, from the data mining stage of KDD.

Privacy-preserving data mining aims to mine or extract information, via machine learning-based models, from privacy-preserving data in which the values of individual records have been perturbed or masked [ 50 ]. The key challenge is that the privacy-preserving data look very different from the original records, and the distribution of data values also differs greatly from the original distribution. Research on this issue started early: Agrawal and Srikant proposed a reconstruction procedure to estimate the distribution of the original data values and then built a decision-tree classifier on it [ 50 ]. Recent studies on PPDM include privacy-preserving association rule mining, privacy-preserving classification, and privacy-preserving clustering.

Association rule mining aims to find interesting associations and correlations among large sets of data items. For PPDM, some of the rules may be considered sensitive. To hide such rules, the original data must be modified to generate a sanitized dataset from which the sensitive rules cannot be mined, while the non-sensitive ones can still be discovered [ 51 ]. Classification is a data-analysis task that learns models to automatically classify data into defined categories. Privacy-preserving classification involves decision trees, Bayesian models, support vector machines, and neural classifiers. The strategies for adapting classification methods to a privacy-preserving scenario fall into two broad categories. The first learns the classification model on transformed data, since transformed data are difficult to recover [ 52 , 53 ]. The second learns the classification model through secure multiparty computation (SMC) [ 54 ], in which multiple parties collaborate to develop a classification model from vertically or horizontally partitioned data without any party disclosing its data to the others [ 55 , 56 ]. Cluster analysis groups a set of records into multiple clusters so that objects within a cluster are highly similar to each other but very dissimilar to objects in other clusters; this process runs in an unsupervised manner. As with classification, current research on privacy-preserving clustering can be roughly categorized into approaches based on data transformation [ 57 , 58 ] and approaches based on secure multiparty computation [ 59 , 60 ].

5.3. Federated Privacy-Preserving Data Mining

For distributed or isolated data, distributed data mining is the relevant research topic. It can be further categorized into data mining over horizontally partitioned data and data mining over vertically partitioned data, and it has attracted much attention. To overcome the difficulty of data integration and promote efficient information exchange without sharing sensitive raw data, Que et al. developed a Distributed Privacy-Preserving Support Vector Machine (DPP-SVM), which enables privacy-preserving collaborative learning in which a trusted server integrates "privacy-insensitive" intermediary results [ 61 ]. In the medical domain, raw data often cannot leave the institution of origin. Instead of bringing data to a central repository for computation, Wu et al. proposed a new algorithm, Grid Binary LOgistic REgression (GLORE), to fit a logistic regression (LR) model in a distributed fashion using information from locally hosted databases containing different observations that share the same attributes [ 62 ].
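
The following sketch conveys the core idea behind distributed fitting of the GLORE kind in simplified form: each site computes the logistic-regression gradient on its local records and shares only that aggregate with a coordinating server, which updates the shared coefficients. It uses plain gradient descent on synthetic data rather than the Newton-Raphson procedure of the published algorithm, and raw records never leave their site.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_site(n):
    # Hypothetical local patient data with 3 features and a binary outcome.
    X = rng.normal(size=(n, 3))
    true_w = np.array([1.5, -2.0, 0.5])
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_w))).astype(float)
    return X, y

sites = [make_site(n) for n in (400, 250, 600)]   # horizontally partitioned data
w = np.zeros(3)
lr = 0.5

def local_gradient(X, y, w):
    # Gradient of the logistic loss, computed inside the institution.
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (p - y), len(y)

for step in range(200):
    # Each site shares only its gradient sum and its sample count.
    grads, counts = zip(*(local_gradient(X, y, w) for X, y in sites))
    w -= lr * sum(grads) / sum(counts)

print("estimated coefficients:", np.round(w, 2))
```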

It is worth noting that learning (classification or clustering) based on secure multiparty computation is an important distributed learning strategy, by which privacy-disclosure concerns are much reduced since the data need not be shared. This research topic likely inspired federated machine learning [ 30 , 32 ]. Today's AI still faces two major challenges: data exist in the form of isolated islands, and data privacy and security requirements are being strengthened. Both challenges are even more severe in the healthcare domain. Federated machine learning aims to build a learning model from decentralized data [ 30 ]. Depending on how the data are distributed among the parties in the feature and sample ID space, federated learning can be classified into horizontal federated learning, vertical federated learning, and federated transfer learning [ 32 ]. Horizontal (sample-based) federated learning applies to scenarios in which datasets share the same feature space but differ in samples; at the end of learning, the universal model and all model parameters are exposed to all participants. Vertical (feature-based) federated learning applies to cases in which two datasets share the same sample ID space but differ in feature space; at the end of learning, each party holds only the model parameters associated with its own features, so at inference time the parties must also collaborate to generate output. Federated transfer learning (FTL) applies to scenarios in which the two datasets differ in both samples and feature space; FTL is an important extension of existing federated learning systems and is most similar to vertical federated learning. The challenge of protecting data privacy while maintaining data utility through machine learning still remains. For a comprehensive introduction to federated privacy-preserving data mining, please refer to the survey based on the proposed 5W-scenario-based taxonomy [ 31 ].
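
For contrast with the gradient-sharing scheme above, the next sketch shows horizontal federated learning in the FedAvg style: each client performs several local gradient steps on its own samples, and the server averages the resulting model weights, weighted by client sample counts. It is a minimal illustration on synthetic data, not a production federated-learning system, and omits the encryption and secure-aggregation layers that real deployments add.

```python
import numpy as np

rng = np.random.default_rng(7)
true_w = np.array([1.0, -1.5, 2.0])

def make_client(n):
    # Hypothetical client data: same feature space, different samples.
    X = rng.normal(size=(n, 3))
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_w))).astype(float)
    return X, y

clients = [make_client(n) for n in (300, 500, 200)]

def local_update(w, X, y, epochs=5, lr=0.1):
    # Several local gradient steps on the client's private data.
    w = w.copy()
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

w_global = np.zeros(3)
for communication_round in range(30):
    local_weights = [local_update(w_global, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    # Server aggregation: sample-size weighted average of client models (FedAvg).
    w_global = np.average(local_weights, axis=0, weights=sizes)

print("federated estimate:", np.round(w_global, 2))
```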

5.4. Summary: Privacy vs. Accuracy

Privacy protection is an indispensable prerequisite for the secondary use of health data. As discussed above, risk-mitigation methods aim to anonymize private or sensitive information so as to reduce the risk of reidentification. Privacy-preserving data mining methods, in turn, aim to process the privacy-scrubbed data to extract knowledge and even build AI systems. If absolute privacy is pursued, the scrubbed data become essentially useless, because data quality is severely degraded; with poor-quality data, the accuracy and effectiveness of any downstream use suffer greatly. Therefore, in practical scenarios, a certain tradeoff or compromise between privacy and accuracy must always be made. The tradeoff can be tuned to provide more or less privacy, resulting in correspondingly less or more accuracy, according to the required privacy and utility levels. Federated privacy-preserving data mining points to a new way to balance privacy and accuracy: because no data need to be shared out, it first processes the original health data within each institution and then conducts federated mining or learning. This type of method is expected to reconcile privacy and accuracy in a more elegant and acceptable way.

6. Conclusions

Clinical data, genomic data, and consumer health data make up the majority of health big data, and their protection and reuse remain intensively studied research topics. In this review article, the types and scope of health data are first discussed, followed by the related regulations for privacy protection. Then, strategies for user-controlled access and secure network architectures are presented. Sharing trained models without the original data ever leaving their source is an important new strategy that is gaining increasing attention. According to different data reuse scenarios, tasks and methods at three different levels are summarized. These strategies and methods can be combined to form a holistic solution.

With the rapid development of health information technology and artificial intelligence, insufficient privacy-protection capability will impede the urgent demand for reusing health data. Some potential research directions include (1) applying modern machine learning to deidentification and anonymization of multimodal health data while ensuring data quality; (2) constructing learning models and extracting knowledge from anonymized data to leverage the secondary use of health data; (3) federated learning on isolated health data, which can both protect privacy and improve the efficiency of data transfer and processing and therefore deserves more attention; and (4) research on alleviating reidentification risks, such as linkage or inference, arising from a trained model.

Acknowledgments

This study was funded by the China Postdoctoral Science Foundation Grant (2020M671059) and the Fundamental Research Funds for the Central Universities (2572020BN02).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Data Privacy and Data Protection: The Right of User’s and the Responsibility of Companies in the Digital World.

12 Pages Posted: 28 Feb 2022

Princess U. Alafaa

Nigerian Law School, Lagos

Date Written: January 7, 2022

With the continuous advancement of technology and the massive increase in internet usage, data privacy and data protection have become hugely debated topics. The service providers who manage websites, applications, and social media platforms often collect and store users' personal data with the objective of providing services that best suit each user's preferences. These digital service companies are usually saddled with the responsibility of protecting users' personal data from unauthorised access and other vulnerabilities. However, instances arise where these platforms fail to put adequate safeguards in place, resulting in data breaches and the exposure of users' sensitive data to unauthorised parties who can use the personal data to defraud or harass users, or to send unwanted adverts without their consent, thereby infringing on users' fundamental rights to privacy and free expression. Hence the need for companies to adopt defensive mechanisms to ensure adequate protection of users' personal data, and for users to be aware that they have a right of control over which personal data to share and with whom it is shared. This paper is divided into three parts: the first discusses the rights of users and the responsibilities of companies as well as the established regulations for the protection of data; the second considers the issues surrounding data privacy and data protection and the challenges faced in ensuring the safety of users' personal data; the last offers a series of recommendations and a conclusion.

Keywords: Data Privacy, Data Protection, User, Data Breach, Privacy Policy, General Data Protection Regulation (GDPR), Nigerian Data Protection Regulation (NDPR), California Consumer Privacy Act (CCPA)

JEL Classification: K24, C8, O3


Confidentiality and Data Protection in Research


Data protection issues in research remain at the top of researchers’ and institutions’ awareness, especially in this day and age when confidential information can be hacked and disseminated. When you are conducting research on human beings, whether it’s clinical trials or psychological inquiries, the importance of privacy and confidentiality cannot be overstated. In the past, it was as easy as a lockable file cabinet. But now, it’s more and more challenging to maintain confidentiality and data protection in research.

In this article, we’ll talk about the implications of confidentiality in research, and how to protect privacy and confidentiality in research. We’ll also touch on ways to secure electronically stored data, as well as third-party data protection services.

Data Protection and Confidentiality in Research

How can you protect privacy and confidentiality in research? The answer, in some ways, is quite simple. However, the means of protecting sensitive data can often, by design, be complex.

Within the research team, the Principal Investigator is ultimately responsible for the integrity of the stored data. Data protection and confidentiality protocols should be in place before the project starts and should cover risks such as theft, loss, or tampering with the data. The simplest way to do this is to limit access to the research data. The Principal Investigator should limit access to this information to the fewest individuals possible, including deciding which research team members are authorized to manage and access any data.

For example, any hard-copies of notebooks, questionnaires, surveys and other paper documentation should be kept in a secure location, where there is no public access. A locked file cabinet, away from general access areas of the institution, for instance. Names and other personal information can be coded, with the encoding key kept in a separate and secure location.

It is the Principal Investigator’s responsibility to make sure that every member of the research team is fully trained and educated on the importance of data protection and confidentiality, as well as the procedures and protocols related to private information.

Check more about the Team Structure and Responsibilities .

Implications of Confidentiality in Research

Even if paper copies of questionnaires, notes, etc., are stored in a safe, locked location, typically all of that information is also stored in some type of electronic database. This fulfills the need to have data available for statistical analysis, as well as information accessible for developing conclusions and implications of the research project.

You’ve certainly heard about the multitude of data breaches and hacks that occur, even in highly sophisticated data protection systems. Since research projects often involve data about human subjects, they can also be a target for hackers. Restoring, reproducing and/or replacing data that’s been stolen, including the time and resources needed to do so, can be prohibitively expensive. And that doesn’t even take into consideration the cost to the human subjects themselves.

Therefore, it’s up to the entire research team to ensure that data, especially around the private information of human beings, is strongly protected.

How Can Electronic Data Be Protected?

Frankly, it’s easier said than done to ensure confidentiality and the protection of research data. There are several well-established protocols, however, that can guide you and your team (a small file-encryption sketch follows the list):

  • Just like for any hard-copy records, limit who has access to any electronic records to the bare minimum
  • Continually evaluate and limit access rights as the project proceeds
  • Protect access to data with strong passwords that can’t be easily hacked, and change those passwords often
  • Access to data files should be done through a centralized, protected process
  • Most importantly, make sure that wireless devices can’t access your data and your network system
  • Protect your data system by updating antivirus software for every computer that has access to the data and confidential information
  • If your data system is connected via the cloud, use a very strong firewall, and test it regularly
  • Use intrusion detection software to find any unauthorized access to your system
  • Utilize encryption software, electronic signatures and/or watermarking to keep track of any changes made to data files and authorship
  • Back up any and all electronic databases (on and offsite), and have hard and soft copies of every aspect of your data, analysis, etc.
  • When applicable, make sure any data is properly and completely destroyed
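
For instance, file-level encryption adds a layer of protection to stored research data. The sketch below assumes Python with the third-party `cryptography` package installed and uses hypothetical file names; as with the coding key for paper records, the encryption key must be stored separately from the encrypted file.

```python
from cryptography.fernet import Fernet

# Generate a symmetric key once and store it separately from the data
# (e.g. in a restricted-access key store, never alongside the encrypted file).
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt a hypothetical exported dataset before it is written to shared storage.
with open("survey_responses.csv", "rb") as f:
    ciphertext = fernet.encrypt(f.read())
with open("survey_responses.csv.enc", "wb") as f:
    f.write(ciphertext)

# Authorized team members holding the key can decrypt when analysis is needed.
plaintext = fernet.decrypt(ciphertext)
```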

Check more about: Why Manage Research Data?

Using Third-Party Data Protection Services

If your institution does not have built-in systems to assure confidentiality and data protection in research, you may want to consider a third party. An outside information technology organization, or a team member specifically tasked to ensure data protection, might be a good idea. Also look into different protections that are often featured within database programs themselves.


  • Open access
  • Published: 26 August 2024

Public attitudes towards personal health data sharing in long-term epidemiological research: a Citizen Science approach in the KORA study

  • Ina-Maria Rückert-Eheberg 1 ,
  • Margit Heier 1 , 2 ,
  • Markus Simon 1 ,
  • Monika Kraus 1 , 3 ,
  • Annette Peters 1 , 3 , 4 , 5 &
  • Birgit Linkohr 1 , 3  

BMC Public Health, volume 24, Article number: 2317 (2024)


Loss to follow-up in long-term epidemiological studies is well known and often substantial. Consequently, there is a risk of bias in the results. The motivation to take part in an epidemiological study can change over time, but ways to minimize loss to follow-up are not well studied. The Citizen Science approach allows researchers to engage in direct discussions with study participants and to integrate their opinions and requirements into cohort management.

Guided group discussions were conducted with study participants from the KORA cohort in the Augsburg Region in Germany, established 40 years ago, as well as a group of independently selected citizens. The aim was to look at the relevant aspects of health studies with a focus on long-term participation. A two-sided questionnaire was developed subsequently in a co-creation process and presented to 500 KORA participants and 2,400 employees of the research facility Helmholtz Munich.

The discussions revealed that altruistic motivations (i.e. supporting research and public health), personal benefits (i.e. a health check-up during a study examination), data protection, and information about research results in layman’s terms were crucial to ensure interest and long-term study participation. The results of the questionnaire confirmed these aspects and showed that exclusively digital information channels may be an obstacle for older and less educated people. Thus, paper-based media such as newsletters are still important.

Conclusions

The findings shed light on cohort management and long-term engagement with study participants. A long-term health study needs to benefit public and individual health; the institution needs to be trustworthy; and the results and their impact need to be disseminated in widely understandable terms and by the right means of communication back to the participants.


In a long-term prospective cohort study, the motivation of people to participate over an extended period and trustfully share their health data is essential to investigating causal relationships between health and disease in constantly changing environments. However, loss to follow-up, i.e. declining willingness to take part in follow-up examinations and questionnaires, is a major problem in all long-term prospective cohort studies [ 1 , 2 , 3 ], raising questions about the generalizability of results [ 4 ]. Information on the reasons to participate is often gathered at the initial sign-up of the study by short non-participant questionnaires [ 5 , 6 , 7 ], satisfaction polls after the study examination for internal conduct improvement, witness statements [ 8 ] or by chance when study participants comment to staff or leave remarks in questionnaires. Non-participants often report acute health problems or stressful life-events, but also unspecific reasons like lack of interest or time constraints. In good epidemiological practice, efforts are made to characterize loss to follow-up during analysis [ 9 ], and particular groups can be identified, e.g. less educated groups or middle-aged men, depending on the cohort [ 10 , 11 ]. However, cohort management should seek to maximize participation in follow-up studies in the first place by trying to meet participants’ expectations. Personal attitudes towards data sharing may change during long-term studies, particularly in the light of the experience of the COVID-19 pandemic. To our knowledge, systematic research into cohort management strategies in long-term epidemiological studies is rare.

Citizen Science, also called “participatory research,” has increasingly been supported by public organizations in and outside of academic institutions to meet information requirements, increase transparency, and improve people’s attitudes towards science [ 12 ]. In 2022, the White Paper “Citizen Science Strategy 2030 for Germany” was published that comprehensively informs about Citizen Science, action areas, networking, funding, volunteer management, and many other aspects [ 13 ]. Meanwhile, a wide range of scientific projects covering all areas of interest are offered to the public [ 14 , 15 , 16 ]. Participatory research strategies have been introduced into health research in various initiatives (e.g. [ 17 ]) with the overarching goal “to reduce concerns about the use of data through intensive exchange with interested citizens and to demonstrate the opportunities it offers” [ 18 ]. Citizen Science in public health can be characterized by typology according to aim, approach, and size, depending on the level of engagement with the community [ 19 ].

Recently, Marks et al. published a scoping review on Citizen Science approaches in chronic disease prevention, covering studies that used Citizen Science to identify problems from the perspective of community members, generate and prioritize solutions, develop, test and/or evaluate interventions, and/or build community capacity [ 20 ]. Frameworks for a systematic development of participatory epidemiology have also been proposed [ 21 ].

Our aim was to employ Citizen Science approaches to engage in direct discussion with study participants from a well-established epidemiological study to evaluate how to maximise study participation long-term by high response rates and low subsequent withdrawal of consent. We were particularly interested in the reasons for continuing to take part in follow-up studies as well as concerns and wishes regarding the collection and use of health data. The research methods combined Citizen Science approaches like qualitative research and co-design elements with a classical quantitative approach in a nested but work-efficient study design. The project was conducted in a randomly selected subgroup of participants of a long-term prospective cohort study and, for comparison, a group of independent citizens and employees of a large health research institution.

The Citizen Science project was embedded in the KORA study (Cooperative Health Research in the Region of Augsburg), an adult population-based prospective cohort study established in 1984 in the City of Augsburg and the adjacent rural counties Augsburg and Aichach-Friedberg in Southern Germany [ 22 ]. Briefly, the KORA study consists of four cross-sectional baseline surveys (S1 from 1984/85 with N = 4,022 (response: 79.3%); S2 from 1989/90 with N = 4,940 (response: 76.9%); S3 from 1994/95 with N = 4,856 (response: 74.9%); and S4 from 1999/2001 with N = 4,261 (response: 66.8%)). The participants, aged 25–74 years (S1: 25–64 years), were randomly selected from population registries. The KORA study is still in active follow-up with a KORA study centre located in the City of Augsburg. A general health survey was sent out in 2021 to all S1 to S4 participants still living in the study area and with consent for recontact. 6,070 out of 9,109 participants answered the survey (66.6%).

The starting point of the project was qualitative research with three guided discussion groups: two with KORA study participants and one with newly recruited citizens. In a co-creation process at a subsequent meeting, a questionnaire was developed with a smaller group of volunteers from the discussion groups. For the quantitative part of the study, this questionnaire was mailed to participants of the KORA study and distributed to all employees of Helmholtz Munich.

Discussion groups

During the preparation of the study setup, a pilot discussion group was conducted with seven acquaintances of the involved scientists. For the two discussion groups with KORA volunteers, 183 KORA study participants were selected (criteria: 50% women, 50% participants of the latest KORA general health survey 2021 with online survey completion and 50% paper-based completion, born 1949–1969, residing in Augsburg or nearby). They were invited in writing by post and contacted by telephone. Citizens were recruited via a newsletter advertisement of the Volunteer Centre Augsburg [ 23 ], and posters and flyers that were distributed in shops, restaurants, the library, the University Hospital Augsburg, and other public places in Augsburg. To compensate for expenses, e.g. for travelling, we paid a small allowance.

The discussions took place between May and June 2023 in the KORA Study Centre in Augsburg. Following a short impulse presentation on the KORA study, the attendees were asked to note their motivations, concerns, and wishes regarding participation in a long-term observational health study separately on index cards. The number of cards was not specified. The participants had the opportunity to present each card to the group before it was displayed on a whiteboard, sorted by the respective category. Guided by two moderators, the raised aspects were discussed in greater depth along with a set of prepared questions. To provide more information on data privacy and protection in the KORA study, the consent form and study information from the most recent KORA general health survey in 2021 were distributed. Each discussion group lasted about 90 min and was rounded off with a little get-together at the end. The discussions were audiotaped with Audacity® 3.2.5 and a microphone of the conference system Logitech CC3000e ConferenceCam and subsequently transcribed. Afterwards, the index cards were coded according to recurring themes. One of the authors, who was part of all three discussion groups, developed a coding scheme with the help of the audiotapes. The scheme was reviewed by another author who was not present at the discussions, and consensus was reached where interpretations differed. Anonymized quotes were selected and translated for publication purposes.

Questionnaire development

The discussion group participants were invited to a subsequent meeting to develop the questionnaire together with the researchers in a co-creation process. The aim was to recruit six volunteers (two per group) to discuss a prepared questionnaire draft in the light of the results from the discussion groups. The questionnaire was designed for mailing to the KORA study participants first and modified slightly for the employees of Helmholtz Munich thereafter. It consisted of questions on the three pre-defined categories motivations, concerns, and wishes and a section on personal data such as sex, age, and school education. Many of the questions were formatted as 5-point Likert scales.

The questionnaire was piloted at the Institute of Epidemiology, and the final version was also translated into English for the Helmholtz Munich employees (Supplement).

Questionnaire survey

The paper version was posted to 500 selected KORA participants, equally balanced by sex. They were randomly chosen from the KORA S1-S3 studies from a total of N  = 2,933 participants born between 1964 and 1945, still living in the study area, and with consent for recontact. 400 of them had taken part in the latest KORA general health survey in 2021, while 100 had not. The approximately 2,400 Helmholtz employees were invited to complete the questionnaire personally on paper in the canteen on campus or online (in PDF format).

Ethics approval and consent to participate

All discussion group participants gave their written informed consent to take part in the discussions. The questionnaire was conducted anonymously, and no written informed consent was required. This study protocol was approved by the ethics committee of the Bavarian Medical Association (EC 23010).

Statistical analysis

The data from the completed questionnaires were transferred to a database and analyzed primarily with R and RStudio (Boston, MA, USA). Characteristics of the qualitative study groups were reported with absolute numbers, and characteristics of the quantitative questionnaire study population with numbers and percentages. The R package "likert" was used to create Likert scale charts (Figs. 1 and 2). Percentages were calculated by summing the two categories “not important” and “not very important”, and the two categories “important” and “very important”, respectively. The category “neutral” was also visualized, and the percentages were given. Figure 3 was set up in Excel. Percentages were calculated and displayed by education level after exclusion of participants with missing information on education (N = 1) and those who had no school-leaving certificate (N = 2). Significance tests were not performed because the statistics were descriptive and not adjusted for confounding factors.
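
The authors report using R and the likert package for these calculations; purely as an illustration of the percentage aggregation described here (not the authors' actual code, and with hypothetical item and category names), an equivalent computation in Python might look like this:

```python
import pandas as pd

# Hypothetical answers to one 5-point Likert item, one row per respondent.
answers = pd.Series(
    ["very important", "important", "neutral", "important",
     "not very important", "very important", "neutral", "important"],
    name="contributing_to_health_research",
)

counts = answers.value_counts(normalize=True) * 100

# Sum the two negative categories and the two positive categories, keep "neutral".
summary = {
    "not important (%)": counts.reindex(["not important", "not very important"]).fillna(0).sum(),
    "neutral (%)": counts.get("neutral", 0.0),
    "important (%)": counts.reindex(["important", "very important"]).fillna(0).sum(),
}
print(pd.Series(summary).round(1))
```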

Figure 1: Reasons to participate in the KORA study or a long-term health study. Percentages on the left represent purple responses, percentages on the right represent green responses.

Figure 2: Concerns about data protection, linkage of study data with secondary health information, and use for non-public research. Percentages on the left represent green responses, percentages on the right represent red responses.

Figure 3: Preferred information channels to disseminate research results of the KORA study, stratified by school education.

Twenty-four people participated in the three discussion groups (17 probands of the KORA study, 7 citizens, 11 women, and 13 men). Their age range was 42 to 78 years (mean age: 65 years). 14 people reported high (12–13 years), 9 intermediate (10 years), and one person low (9 years) school education.

Table  1 shows the results of the group discussions stratified by category. There was no major difference between KORA participants and the citizen group. Most ideas were raised in the category motivations, followed by wishes and concerns. We excluded statements that went beyond the scope of a health study (concerns: general criticism of the health system (3x) and study staff would not listen (1x); wishes: individual health advice (8x) and contact between participants (2x)).

The number of people who referred to one of the aspects listed in the table is depicted in column N.

For many volunteers, a motivation to take part in the KORA study or a health study in general was the free preventive medical check-up in the form of the study examinations.

Discussion Group 1, KORA participant: “So, my motivation to join was to get information about my health that I wouldn’t have gotten otherwise.”

Additionally, the discussants placed great importance on the benefits for the public, their contribution to health research, and their interest in it.

Discussion Group 3, KORA participant: “In terms of motivation, the focus is, of course, quite clearly on the fact that the benefit is for the general public.” Discussion Group 1, KORA participant: “And then, of course, that one contributes to general research.”

The professional conduct of the study was also mentioned several times.

The participants raised fewer issues in the category concerns than in the categories motivations and wishes. The main aspects were protection and security of health data in KORA or generally in health studies.

Discussion Group 1, KORA participant: “My concerns are (…) data protection and data usage. Not particularly in relation to Helmholtz Munich, but the overall (…) misuse, data hackers, cybercrime, all that stuff. And that will increase even more in the future.” Discussion Group 2, Citizen: “…it is always difficult with data protection in an international comparison. We have very high standards here, but can we maintain them in the long term? Because, of course, we also create barriers that are incomprehensible to others.”

Some of these concerns were not directed at the discussants themselves but rather at younger people who might suffer greater harm through misuse. Discrimination in professional life or when taking out insurance were mentioned as examples in this context.

Discussion Group 1, KORA participant: “Personally, I wouldn’t mind (…), but with younger, working people, I would probably have a different opinion. Because today, you can supposedly already say that people might get certain diseases at some point. (…) And I think that is dangerous if this information goes to the insurance companies or to the employers themselves (…).”

The participants did express their trust in Helmholtz Munich as a publicly funded research institution, and the consent form and study information were considered informative and clear; some participants even found them too detailed.

A minority of the participants had no worries whatsoever.

Discussion Group 3, KORA participant: “I really can’t say anything about concerns. If my data were published with my name, I wouldn’t care at all.”

In the category wishes, the participants pointed out that more communication on study results and their translation into the health care system would motivate them long-term to participate in a study.

Discussion Group 2, Citizen: “(…) the research results must be disseminated more widely. In my opinion, they have primarily been intended for experts.” Discussion Group 2, Citizen: “I find the contributions on the Internet (…) terrible. The layperson gets all mixed up. You’d have to clean up that mess, too.”

Many participants indicated that simple, brief, and comprehensible communication was appreciated. Some discussants preferred digital formats, while others explicitly stated that they wanted paper-based communication only. Overall, the discussion group participants were open to health research and were interested in more frequent examinations and additional study offers.

A two-page questionnaire was developed in a meeting between two out of the 24 discussion group participants and two researchers. The participants pointed out some complicated questions and assessed the overall comprehensibility.

The survey was completed by 278 KORA participants (response rate: 67% in those who had participated in the latest KORA follow-up and 9% in those who had not participated) and 285 Helmholtz Munich employees (response rate: about 12%, as the exact number of employees was not available), resulting in a total study population of 563 people. The characteristics of the study population are displayed in Table 2. Approximately the same number of women and men took part in the survey. The KORA study participants were between 58 and 78 years old (mean age: 67.9 years). The Helmholtz Munich employees were younger, mostly between 20 and 50 years old (mean age: 39.8 years). About one-third each of the KORA participants had low (9 years), intermediate (10 years), or high (12–13 years) levels of school education. In contrast, most of the Helmholtz Munich employees (89.2%) had a high level of education. 71.4% of the Helmholtz Munich employees worked scientifically, and 70.4% had German citizenship.

In the questionnaire, participants were asked how important they rated the three listed reasons to participate in the KORA study or a long-term health study (Fig.  1 ). The answers of the KORA study participants and the Helmholtz employees were very similar. A majority of about 90% deemed “contributing to health research” and “benefits for the general public” as very important or important. “Free comprehensive medical check-ups” were also seen as important or very important by about 70%, while about 20% took a neutral position on this aspect.

Differences between the two participant groups were found regarding questions about concerns in relation to data protection and data linkage (Fig.  2 ). Only a small proportion of the KORA study participants had reservations about data protection in the KORA study (3%). Concerns or strong concerns increased with regards to linking their study data to secondary health data such as diagnoses by their physicians (7%), prescription and treatment data by their health insurance (14%), but it decreased with regards to the cause of death sometime in the future (7%). In comparison, 35% of the Helmholtz Munich employees had concerns or strong concerns about data protection in a long-term health study. Data linkage was seen critically by 35% regarding study and physician diagnosis data, by 41% regarding study and health insurance data, and by 17% regarding study and death certificate data.

A larger proportion in both groups (29% of the KORA participants and 57% of the Helmholtz Munich employees) indicated concerns or strong concerns about the utilization of their health data by non-public research organizations.

The KORA participants were asked how they would like to be informed about the research results of the KORA study. Multiple selections were allowed. Figure  3 shows the percentages stratified by school education. Participants with a high level of school education preferred digital channels such as electronic newsletters and websites, in contrast to participants with low or intermediate school education, who preferred information, i.e. newsletters by paper mail. About 20% of each group indicated that they would appreciate coverage of scientific research results via newspapers, radio, and TV, while books were only interesting for a small proportion of participants. Less than 10% did not wish for any information. Of the 147 participants who chose a newsletter by paper mail, 20% also selected a newsletter by email, and 4% also selected the website category – thus, 77% of those who chose paper mail wanted no digital information.

Using Citizen Science approaches, this project examined the motivations, concerns, and wishes of research participants to help slow down the decline in follow-up study participation. The KORA study was established almost four decades ago and is still in active follow-up with relatively high response rates, e.g. 64% in an examination in 2018/19 [ 24 ] and 66.7% in a general health survey in 2021. Longitudinal data is particularly informative for life-course health research, but few studies exist on how to keep up motivation in follow-up studies. The findings from the discussion groups and the questionnaire survey showed that participants can be motivated to provide their personal health data for scientific purposes over long periods of time if their expectations are met. Three main reasons to participate in a long-term health study were identified: the benefit to the public, scientific progress, and personal health. Those findings are consistent with a previous study led by KORA scientists in 2010 on the public perceptions of cohort studies and biobanks during the recruitment phase of the German National Cohort (NAKO) [ 25 ]. They found that in general, citizens approve epidemiological research based on expectations for communal and individual benefits (e.g., health check-ups and health information). This shows that the basic motivation for study participation does not change between study initiation and long-term follow-up. Collaboration with science [ 26 ], making a contribution to society [ 27 ], and receiving information about personal health [ 28 ] have also been known as motivations for study participation in clinical studies. In a recent study on retaining participants in longitudinal studies of Alzheimer’s disease, altruism and personal benefit were the factors associated with continued study participation as well [ 29 ].

In the discussion groups, data protection did not come up as a major concern and was not necessarily directed at the KORA study. In the questionnaire, participants had no strong concerns about their data in the KORA study, even for data linkage. This is in line with the findings by Bongartz et al. that the trustworthiness of those conducting research appeared to be most important for the decision to participate in a health-related study [ 30 ]. However, Helmholtz Munich employees expressed more concerns with regards to data protection and data linkage. A likely interpretation for this difference is that KORA participants referred to a specific study that they had a lot of experience with, while Helmholtz employees imagined some theoretical long-term health study. Moreover, the Helmholtz employees were, on average, younger, higher educated, and probably more informed about data protection and data security risks. Our findings showed that institutional trust is essential for long-term participation in a health study. Once trust is gained at initial sign-up, it is important to maintain it. The comprehensive study by Tommel et al. also supports the importance of trust [ 31 ]. They explored citizens and healthcare professionals’ perspectives on personalized genomic medicine and personal health data spaces in questionnaires and interviews. Cohort management can help maintain trust, but overall satisfaction with the health system, public health policy, or pandemics is outside its scope.

About one-third of the KORA participants and about two-thirds of the Helmholtz Munich employees expressed concern about sharing data with non-public research organizations. This is in line with findings that people are generally prepared to participate in epidemiological research if it is conducted by a trusted public institution, but that there is widespread distrust of research conducted or sponsored by pharmaceutical companies [ 32 , 33 ]. However, this degree of concern in both groups was somewhat surprising, as most KORA participants had given consent to sharing their data with industry previously, and Helmholtz Munich contributes to the translation of research into medical innovation with commercial partners.

The discussion group participants wished to be informed about the results and impact of the research in a generally understandable format. The information should be addressed to them personally, such as through a newsletter, rather than in the press, TV, or the internet. A notable proportion of the KORA participants wished to be informed via non-digital means. This is an important finding for those running population-based studies such as the German National Cohort [ 34 ] and their financing bodies. While the finding may be specific to the setting in Southern Germany and a long-term cohort study with aged participants, it is important to monitor the information preferences. In addition to digital tools, paper-based methods are still needed for many more years to not lose large groups of the general population. Future research should focus particularly on the digital readiness of older citizens, so that cohort management strategies can engage participants at their level. In long-term health studies, morbidity and mortality are often relevant health outcomes. Public health policies that enable secondary data linkage could also compensate for loss to follow-up and limit selection bias.

Strengths and limitations

A strength of this project is its diverse group of participants, which includes stakeholders from a long-term epidemiological study, independent citizens, and staff from a research institution earning their living in health science research.

The discussion groups were structured but allowed participants to explain their own narratives and introduce new issues. The questionnaire was administered to two very different groups of participants, and in part, similar results were obtained that confirmed each other (i.e., important motivations to take part in health research (Fig.  1 )).

With respect to limitations, a Citizen Science project depends on participants who are interested and motivated to take part. It is quite difficult to find enough participants, and the 24 discussion group volunteers do not necessarily represent the “general” public, especially as discussants with low education were underrepresented. Participants living in rural areas were completely absent due to the recruitment strategy, which required them to live within a reasonable travel distance of the KORA study center. The dates and times of the discussion groups were fixed by the researchers and probably discouraged very busy people. However, the small fee and snacks seemed to motivate some of the participants with lower economic status to take part.

In addition, it cannot be ruled out that the ideas of the discussants as well as the answers of the questionnaire survey were influenced by social desirability, perhaps on a subconscious level, and people might thus act somewhat differently in real life than they indicated they would in a theoretical setting. In a group discussion, participants may give answers that they believe to be expected and that will please the interviewer or moderator. Social desirability bias was certainly less of an issue in the questionnaire survey as it was anonymous. However, the outcomes of the discussion groups generally agreed with the responses to the questionnaire given to KORA participants. This questionnaire represents the views of a pre-selected group of people who were recruited up to forty years ago and who still consent to be contacted again for follow-up research. The response to the questionnaire by the KORA participants was as expected: It was high among those who had participated in the latest KORA general health survey in 2021, but it was very low in those who did not participate at the time. This shows that participants who are lost to follow-up are difficult to re-engage.

Finally, the development of the questionnaire was intended to be a co-creation process between selected discussion group participants and scientists. However, the interest of the discussion group members in co-creation was low, and only two participants were willing to take part in this process. They improved the comprehensibility of the questionnaire draft but saw themselves clearly as contributors rather than co-creators. A successful co-creation process requires more capacity building than was possible in this project. As Laird et al. pointed out, Citizen Science approaches often face barriers like building up longer-term collaborative relationships, and their implementation is often time and resource constrained [ 35 ].

The Citizen Science approach opens up a new way to engage more closely with study participants and to integrate their opinions and requirements into cohort management.

On the one hand, people are altruistically motivated when they decide to take part in a long-term health study, and they enjoy the possibility to contribute to public benefit and scientific progress. On the other hand, they also see benefits for their personal health. Concerns do not seem to prevail. Feedback in layman’s terms on the long-term results of the study is highly appreciated and should be addressed to the participant personally.

Cohort management should include regular feedback of results as a thank you for the data donation and contribution to society.

In other words, a long-term health study needs to benefit public and individual health, to be trustworthy regarding data protection and data use, and to provide long-term research results in generally understandable terms and in the preferred communication mode back to the participants.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request via the application tool KORA.passt ( https://helmholtz-muenchen.managed-otrs.com/external/ ).

Abbreviations

KORA: Cooperative Health Research in the Region of Augsburg

Osler M, Linneberg A, Glumer C, Jorgensen T. The cohorts at the Research Centre for Prevention and Health, formerly ‘The Glostrup Population studies’. Int J Epidemiol. 2011;40(3):602–10.


Rabel M, Meisinger C, Peters A, Holle R, Laxy M. The longitudinal association between change in physical activity, weight, and health-related quality of life: results from the population-based KORA S4/F4/FF4 cohort study. PLoS ONE. 2017;12(9):e0185205.


Volzke H, Schossow J, Schmidt CO, Jurgens C, Richter A, Werner A, et al. Cohort Profile Update: the study of Health in Pomerania (SHIP). Int J Epidemiol. 2022;51(6):e372–83.

Zivadinovic N, Abrahamsen R, Pesonen M, Wagstaff A, Toren K, Henneberger PK, et al. Loss to 5-year follow-up in the population-based Telemark Study: risk factors and potential for bias. BMJ Open. 2023;13(3):e064311.

Hoffmann W, Terschüren C, Holle R, Kamtsiuris P, Bergmann M, Kroke A, et al. [The problem of response in epidemiologic studies in Germany (Part II)]. Gesundheitswesen. 2004;66(8–9):482–91.


Enzenbach C, Wicklein B, Wirkner K, Loeffler M. Evaluating selection bias in a population-based cohort study with low baseline participation: the LIFE-Adult-study. BMC Med Res Methodol. 2019;19(1):135.

Holle R, Hochadel M, Reitmeir P, Meisinger C, Wichmann HE. Prolonged recruitment efforts in health surveys: effects on response, costs, and potential bias. Epidemiology. 2006;17(6):639–43.

NaKo - Botschafter. https://nako.de/studie/nako-botschafter/ . Accessed 07 May 2024.

Nohr EA, Liew Z. How to investigate and adjust for selection bias in cohort studies. Acta Obstet Gynecol Scand. 2018;97(4):407–16.

Powers J, Tavener M, Graves A, Loxton D. Loss to follow-up was used to estimate bias in a longitudinal study: a new approach. J Clin Epidemiol. 2015;68(8):870–6.

Kendall CE, Raboud J, Donelle J, Loutfy M, Rourke SB, Kroch A, et al. Lost but not forgotten: a population-based study of mortality and care trajectories among people living with HIV who are lost to follow-up in Ontario, Canada. HIV Med. 2019;20(2):88–98.

European Citizen Science Platform. https://eu-citizen.science. Accessed 07 May 2024.

Bonn A, Brink W, Hecker S, Herrmann TM, Liedtke C, Premke-Kraus M, Voigt-Heucke S et al. White Paper Citizen Science Strategy 2030 for Germany ( https://www.mitforschen.org/sites/default/files/grid/2024/07/24/White_Paper_Citizen_Science_Strategy_2030_for_Germany.pdf) 2022.

mit:forschen!. Gemeinsam Wissen schaffen (ehemals Bürger schaffen Wissen). www.buergerschaffenwissen.de. Accessed 07 May 2024.

Zooniverse. People-powered research. www.zooniverse.org. Accessed 07 May 2024.

European Commission - Marie Skłodowska-Curie Actions. https://marie-sklodowska-curie-actions.ec.europa.eu/news/marie-sklodowska-curie-actions-funds-44-projects-to-bring-research-closer-to-education-and-society-across-europe. Accessed 08 May 2024.

Schütt AM-F, Weschke E. Sarah. Aktive Beteiligung Von Patientinnen Und Patienten in Der Gesundheitsforschung. Eine Heranführung für (klinisch) Forschende. Bonn/Berlin: DLR Projektträger; 2023.


NFDI4Health. Nationale Forschungsdateninfrastruktur für personenbezogene Gesundheitsdaten. https://www.nfdi4health.de . Accessed 07 May 2024.

Den Broeder L, Devilee J, Van Oers H, Schuit AJ, Wagemakers A. Citizen Science for public health. Health Promot Int. 2018;33(3):505–14.


Marks L, Laird Y, Trevena H, Smith BJ, Rowbotham S. A scoping review of Citizen Science Approaches in Chronic Disease Prevention. Front Public Health. 2022;10:743348.

Bach M, Jordan S, Hartung S, Santos-Hovener C, Wright MT. Participatory epidemiology: the contribution of participatory research to epidemiology. Emerg Themes Epidemiol. 2017;14:2.

Holle R, Happich M, Lowel H, Wichmann HE, Group MKS. KORA–a research platform for population based health research. Gesundheitswesen. 2005;67(Suppl 1):S19–25.

Freiwilligen-Zentrum Augsburg. https://www.freiwilligen-zentrum-augsburg.de/ . Accessed 07 May 2024.

Rooney JP, Rakete S, Heier M, Linkohr B, Schwettmann L, Peters A. Blood lead levels in 2018/2019 compared to 1987/1988 in the German population-based KORA study. Environ Res. 2022;215(Pt 1):114184.

Starkbaum J, Gottweis H, Gottweis U, Kleiser C, Linseisen J, Meisinger C, et al. Public perceptions of cohort studies and biobanks in Germany. Biopreserv Biobank. 2014;12(2):121–30.

Costas L, Bayas JM, Serrano B, Lafuente S, Muñoz MA. Motivations for participating in a clinical trial on an avian influenza vaccine. Trials. 2012;13:28.

Richter G, Krawczak M, Lieb W, Wolff L, Schreiber S, Buyx A. Broad consent for health care-embedded biobanking: understanding and reasons to donate in a large patient sample. Genet Med. 2018;20(1):76–82.

Akmatov MK, Jentsch L, Riese P, May M, Ahmed MW, Werner D, et al. Motivations for (non)participation in population-based health studies among the elderly - comparison of participants and nonparticipants of a prospective study on influenza vaccination. BMC Med Res Methodol. 2017;17(1):18.

Gabel M, Bollinger RM, Coble DW, Grill JD, Edwards DF, Lingler JH, et al. Retaining participants in Longitudinal studies of Alzheimer’s Disease. J Alzheimers Dis. 2022;87(2):945–55.

Bongartz H, Rübsamen N, Raupach-Rosin H, Akmatov MK, Mikolajczyk RT. Why do people participate in health-related studies? Int J Public Health. 2017;62(9):1059–62.

Tommel J, Kenis D, Lambrechts N, Brohet RM, Swysen J, Mollen L et al. Personal Genomes in Practice: Exploring Citizen and Healthcare Professionals’ Perspectives on Personalized Genomic Medicine and Personal Health Data Spaces Using a Mixed-Methods Design. Genes (Basel). 2023;14(4).

Slegers C, Zion D, Glass D, Kelsall H, Fritschi L, Brown N, et al. Why do people participate in epidemiological research? J Bioeth Inq. 2015;12(2):227–37.

Richter G, Borzikowsky C, Lesch W, Semler SC, Bunnik EM, Buyx A, et al. Secondary research use of personal medical data: attitudes from patient and population surveys in the Netherlands and Germany. Eur J Hum Genet. 2021;29(3):495–502.

Peters A, German National Cohort C, Peters A, Greiser KH, Gottlicher S, Ahrens W, et al. Framework and baseline examination of the German National Cohort (NAKO). Eur J Epidemiol. 2022;37(10):1107–24.


Laird Y, Marks L, Smith BJ, Walker P, Garvey K, Jose K et al. Harnessing citizen science in health promotion: perspectives of policy and practice stakeholders in Australia. Health Promot Int. 2023;38(5).


Acknowledgements

We thank all participants of the discussion groups and the questionnaire survey for their contributions, the staff for data collection and research data management, and the members of the KORA Study Group (https://www.helmholtz-munich.de/en/epi/cohort/kora) who are responsible for the design and conduct of the KORA study.

The KORA study was initiated and financed by the Helmholtz Zentrum München – German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. Data collection in the KORA study is done in cooperation with the University Hospital of Augsburg. The project was supported by the NFDI4Health (National Research Data Infrastructure for Personal Health Data) citizen-science 2023 initiative to support participatory research ( https://www.nfdi4health.de/community/citizen-science.html ).

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and affiliations.

Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Munich, Germany

Ina-Maria Rückert-Eheberg, Margit Heier, Markus Simon, Monika Kraus, Annette Peters & Birgit Linkohr

KORA Study Centre, University Hospital of Augsburg, Augsburg, Germany

Margit Heier

German Centre for Cardiovascular Research (DZHK e.V.), Munich Heart Alliance, Munich, Germany

Monika Kraus, Annette Peters & Birgit Linkohr

Institute for Medical Information Processing, Biometry and Epidemiology (IBE), Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany

Annette Peters

Partner Site München-Neuherberg, German Center for Diabetes Research (DZD), Munich-Neuherberg, Germany


Contributions

IMRE contributed to the conception, design, and conduct of the study, analyzed and interpreted the data, and drafted the manuscript. MH contributed to the conception, design, and conduct of the study, interpreted the data, and revised the manuscript. MS contributed to the design and conduct of the study, interpreted the data, and revised the manuscript. MK contributed to the design and conduct of the study, interpreted the data, and revised the manuscript. AP contributed to the conception and design of the study, interpreted the data, and revised the manuscript. BL contributed to the conception, design, and conduct of the study, interpreted the data, and drafted the manuscript. All authors read and approved the final manuscript. They agree to be accountable for their own contributions and that questions that may arise on the accuracy or integrity of the work will be appropriately investigated, resolved, and documented.

Corresponding author

Correspondence to Ina-Maria Rückert-Eheberg.

Ethics declarations

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Rückert-Eheberg, IM., Heier, M., Simon, M. et al. Public attitudes towards personal health data sharing in long-term epidemiological research: a Citizen Science approach in the KORA study. BMC Public Health 24 , 2317 (2024). https://doi.org/10.1186/s12889-024-19730-0


Received: 17 May 2024

Accepted: 08 August 2024

Published: 26 August 2024

DOI: https://doi.org/10.1186/s12889-024-19730-0


  • Citizen science
  • Participatory research
  • Public engagement
  • Health data sharing
  • Epidemiological cohort management
  • Co-creation


Transformation of the AECO industry through the metaverse: potentials and challenges

  • Open access
  • Published: 27 August 2024
  • Volume 6, article number 461 (2024)


  • Hannah Claßen 1 ,
  • Niels Bartels   ORCID: orcid.org/0000-0001-7517-3422 1 ,
  • Urs Riedlinger   ORCID: orcid.org/0000-0002-7406-5486 2 &
  • Leif Oppermann   ORCID: orcid.org/0000-0002-8920-0986 2  

The integration of the metaverse into the real estate and construction industry reveals various potentials, but also challenges. The increasing digitization in the architecture, engineering, construction, and operation (AECO) sector requires a critical examination of developments such as the metaverse. This paper examines the impact of the metaverse on the real estate and construction industry. It is primarily aimed at readers in the AECO sector and seeks to provide an initial overview of opinions within this sector. The methodology comprises an in-depth literature review and a representative survey; respondents from different age groups and areas of activity within the construction and real estate industry took part in this survey. The research questions of this paper aim to identify the range of metaverse applications in the AECO industry and to assess their business potential and challenges. The aim is to develop initial definitions and use cases and to create an overview of opinions in the industry. In this context, potential opportunities and risks are examined to derive recommendations for an effective integration of the metaverse into the AECO industry. The results indicate that there is still considerable uncertainty in the construction and real estate industry: the term “metaverse” and the potential of targeted use cases are not yet widely known in this industry. The survey participants recognize a potential for 3D visualizations in the metaverse that extends over the entire life cycle of buildings; an exemplary scenario is the use of 3D visualizations both during the planning phase and in marketing. The challenges identified shed light on uncertainties relating to data protection, privacy, and the influence of the internet. The results of the study reveal a high level of uncertainty and a lack of knowledge within the industry when it comes to understanding the metaverse. Based on the results, further studies are needed to establish the understanding and real potential of the metaverse in the industry. Conducting workshops specifically aimed at the AECO sector can help to deepen understanding of the potential of possible use cases.

Article highlights

Analysis of potentials and challenges of metaverse in architecture, engineering, construction, and operations (AECO)

Methodology comprises a literature review, workshops, and a representative survey

Results show a high level of uncertainty and lack of knowledge in the construction industry regarding the metaverse



1 Introduction

The onset of the twenty-first century heralded an era of unprecedented digital transformation across various industries. Among these, the construction industry, a vital component of global economic infrastructure, stands on the brink of a potential change with the integration of metaverse technologies in architecture, engineering, construction, and operations (AECO). This paper aims to explore the burgeoning role of the metaverse in AECO, analyzing the attitude of the construction industry towards its impact on enhancing digital capabilities and reshaping industry dynamics.

The concept of the metaverse, a collective virtual shared space created by the convergence of virtually enhanced physical reality and persistent virtual spaces, offers untapped potential for the AECO sector. Its implications are vast, ranging from virtual design and collaboration to enhanced project management and operations, but also real estate marketing. The integration of these virtual elements into the physical processes of construction signifies a paradigm shift in how industry professionals envision, design, and execute their projects. However, despite the promise of these innovations, the construction industry has faced challenges in adopting digital transformation at a pace comparable to other sectors. A pivotal study by [ 1 ] highlighted the slow adoption of digital technologies in the construction industry, while noting significant advancements in sustainability practices. This dichotomy underscores the need for a balanced approach that leverages technological advancements while addressing the industry's unique challenges and opportunities. Furthermore, the European Commission's 2021 analytical report on digitization in the construction sector provides a comprehensive overview of the current state of digital transformation in this field [ 2 ]. It underscores the critical need for integrating advanced technologies like the metaverse to stay competitive and efficient. Various countries already have metaverse strategies. In the European Union, such a strategy is currently discussed.

Analyses by PwC and the European Commission underscore the urgent need to advance the digital transformation in the construction and real estate industry. These publications identify digitization as a crucial lever for future success and value creation. The metaverse represents just one aspect of the digital transformation in the AECO sector that will be examined in this paper. The lack of comprehensive and representative studies on the integration of the metaverse into the AECO sector highlights the relevance and urgency of conducting initial scientific investigations in this area. These studies are essential for gaining a better understanding of the opportunities and challenges, thereby creating greater transparency. Furthermore, there are currently neither standardized definitions of the metaverse nor specific definitions for its application in the AECO sector. This results in inconsistent understanding and application within the industry. [ 3 ] Given the current state, where the metaverse is scarcely integrated into the AECO sector and no standardized definitions exist, the central research question of this paper is formulated as follows: What challenges and opportunities, as well as specific use cases, arise from the integration of the metaverse to advance the development and progress of the AECO sector? The problem, therefore, lies in the lack of knowledge and standards within the industry on this topic. The aim of this paper is to establish a comprehensive foundation for the integration of the metaverse into the real estate sector, familiarize stakeholders with this topic, and lay the groundwork for further in-depth studies. This includes considering all lifecycle phases of a building and identifying key issues such as challenges, opportunities, use cases, and future research needs. In order to approach this topic, a literature analysis is chosen at the beginning to explain basic concepts and provide theoretical insights into the current situation of the metaverse. Workshops will then be held with selected stakeholders in order to gain an initial opinion directly from the industry. Building on this, an online survey will be conducted in order to comprehensively record the content of the literature analysis and the results of the workshops for the broad mass of the industry. In this way, initial sentiments of the industry as well as opportunities and challenges can be presented.

The remainder of this paper is organized as follows: First, we present the state of the art of digital transformation in AECO, covering relevant digital technologies by way of example, introduce several definitions of the metaverse, and pay particular attention to extended reality (XR) technologies in the construction industry. We then describe our methodology, incorporating expert workshops and various use cases. These considerations led us to a list of challenges that we incorporated into an online questionnaire answered by 291 participants. After a discussion of our key findings, we provide a conclusion and an outlook.

2 State of the art

Several current research areas are relevant to this article. First, we give a general overview of AECO and digitization before going deeper into the aspects of Building Information Modeling (BIM), XR technology and the applications of metaverse in the AECO sector. Finally, we present some examples of use cases for metaverse in the AECO sector.

2.1 AECO and digitization

In the AECO sector, digital transformation and digitization are disrupting the way projects are executed by the various stakeholders throughout the lifecycle [ 4 , 5 , 6 ]. Therefore, and to maintain the growth of the AECO sector, the adoption of digital tools as well as the associated methods, processual and organizational aspects, and change management are vital [ 7 , 8 ]. The AECO sector uses and combines various technologies that aim especially to improve sustainability (e.g. by optimizing building performance or reducing material use and waste) and cross-lifecycle data exchange (e.g. transferring as-built data to facility management and material passports), as well as to support efficiency in the design and construction process (e.g. avoiding time delays, cost increases, and safety risks) [ 9 , 10 , 11 , 12 , 13 ].

There are multiple digital technologies relevant to the digital shift in the AECO sector, namely Building Information Modeling, robotics and automation, big data, artificial intelligence, IoT, and XR. In this section, BIM is used as an example to show the challenges that occurred during the implementation of such technologies and to derive challenges for the implementation of the metaverse in the AECO sector.

Building Information Modeling is a systematic method that enables the comprehensive exchange of data and information through the targeted use of a variety of technologies and processes, resulting in a digital building model during planning and construction [ 14 ]. Special data exchange formats are used to seamlessly integrate data from different disciplines, such as architecture and technical building services. [ 15 , 16 ] This data, which is generated from building planning through to utilization, is represented in a digital, three-dimensional building model [ 17 ]. BIM models are used in the areas of design coordination, construction planning and facility management, as they contain comprehensive information on the design, construction, and operation of a building [ 18 ]. It represents a life cycle model for real estate [ 19 ]. Semantic information about the building context is provided, covering aspects such as geometry and properties of facilities, spatial relationships, proximity, and connectedness [ 20 ]. The digital building model derived from BIM has the potential to evolve into a real-time information model by integrating measurements and metadata from IoT (Internet of Things) sensors [ 21 , 22 ]. The AECO industry recognizes BIM as a significant opportunity and sees it as the key to digitization and achieving technological leadership in the industry [ 23 ]. The application of the BIM methodology is undergoing a revolutionary development through experience-oriented communication with numerous sensory perceptions [ 24 ].
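As a purely illustrative aside (not part of the cited works), the following Python sketch shows how such semantic building information could be read programmatically from an IFC model; it assumes the open-source ifcopenshell library and a hypothetical file building.ifc.

```python
# Illustrative sketch only: assumes ifcopenshell is installed and that a
# hypothetical IFC model "building.ifc" exists; not taken from the cited works.
import ifcopenshell
import ifcopenshell.util.element

model = ifcopenshell.open("building.ifc")

# IfcSpace entities carry part of the semantic context described above:
# identifiers, names, and attached property sets (usage, areas, etc.).
for space in model.by_type("IfcSpace"):
    psets = ifcopenshell.util.element.get_psets(space)
    print(space.GlobalId, space.Name, psets.get("Pset_SpaceCommon", {}))
```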

The challenges and limitations of Building Information Modeling are among the key research issues. In recent years, the challenges of emerging topics in the field of BIM, such as the combination of BIM with AI (e.g., large language models like ChatGPT), lifecycle assessment, or augmented reality, have been researched in particular [ 25 , 26 , 27 , 28 , 29 , 30 ]. By conducting a literature review, the following categories of challenges could be identified:

Skills and educational barriers—especially in the early years of BIM, a lack of education and training in BIM could be observed, which led to a shortage of skilled BIM users and authors at the beginning [ 31 ]. Although university courses have increased in recent years, the lack of skilled users, authors, and developers is still a key success criterion for the implementation of BIM [ 32 ]. The combination of practical and theoretical education of students is one main challenge, due to the highly volatile software landscape and its influence on BIM education [ 33 ].

Technology barriers—in current practice, vendor-independent data exchange is often lacking. Although there are vendor-neutral standards (such as IFC or COBie), software vendors do not implement them neutrally [ 34 , 35 ]. In addition, there are licensing problems, so that some stakeholders of the AECO industry need various software products to work in different projects (e.g., in project A software X is needed, while in project B software Y is needed, due to the requirements of the owner, general planner, or general contractor) [ 36 , 37 ].

Financial barriers—the introduction of BIM often causes high implementation costs, which is stated as one barrier to implementation [ 38 ]. On the one hand, these implementation costs are caused by the purchase of software and hardware, such as IT tools (e.g. authoring tools, the implementation of Common Data Environments, or cloud services) [ 39 ]. On the other hand, employees need to be trained in BIM due to a lack of knowledge [ 39 ]. In addition, high ongoing investments are necessary after implementation to provide the BIM infrastructure [ 40 ]. Therefore, it is necessary to keep the focus on the ROI, which can be achieved within a few years, but various factors, such as the real productivity improvement, intangible returns, and the lack of industry standards, need to be taken into account [ 41 , 42 ].

Legal barriers—the legal rules for the implementation of BIM in the AECO industry (e.g., ownership of the model, liability for software and operator errors) are still in their infancy and need further refinement. In particular, the handling of contractual aspects (such as Employer's Information Requirements or BIM Execution Plans) is still not fully implemented and remains unclear [ 43 ]; although some rules and standards already exist, others are still being developed and introduced.

The example of BIM shows that in the AECO industry there is a traditional resistance to changing existing structures, and that social aspects are more important than in other industries [ 44 , 45 ].

2.2 Applications of the metaverse

2.2.1 Origin of the metaverse

The origins of the term “metaverse” can be traced back to 1992, when American science fiction author Neal Stephenson published his work “Snow Crash.” [ 46 ] In this literary creation, the author designed a virtual reality that enabled parallel interplay and interactive experiences with the physical world. This novel utopian concept explored social emotions and sought to transcend the boundaries of time. [ 47 ] The metaverse discussed in this paper is currently in the development phase [ 48 , 49 ]. For this reason, the exact definition of the metaverse has not yet been clarified in scientific research [ 50 ]. The term metaverse is made up of the components “meta”, which comes from the Greek and means a kind of “transcendence”, and the English word “universe” [ 51 ]. In the discipline of computer science, continuous innovations register significant added value for users in terms of their interaction and action possibilities in the context of digital and virtual worlds [ 52 ]. However, there is currently a lack of entities and transparent standards to guide the design of the metaverse. Systematic approaches to data protection and privacy are still needed. More intense competition is underway between companies as they strive to develop closed and proprietary hardware and software solutions to establish themselves as leading players in the metaverse. [ 52 ] The lack of standards is also reflected in the fact that there is no universally valid and clear definition of the metaverse yet. Due to this fact, there is no single metaverse. Instead, there are different platforms and technologies. Each of these can be seen as a separate metaverse. [ 53 ] Examples include Decentraland, Second Life, Minecraft, Roblox, and The Sandbox [ 54 ].

2.2.2 Metaverse concepts

In 2020, the global Covid-19 pandemic led to significant changes in our society. They presented us with new challenges that required innovative solutions. The introduction of new working models such as remote work acquired unprecedented importance in society at this time. At the same time, the metaverse and certain technological innovations came into greater focus. The metaverse showed its potential to bridge the divide between working from home and working in the office. From an industrial innovation perspective, the metaverse has the potential to overcome the established physical norms to which people have become accustomed. It can stimulate the development of industrial technologies in a revolutionary way, promoting widespread integration of different sectors of the economy. This can accelerate change and modernization in relevant industries through the introduction of new concepts and formats. The metaverse is often referred to as the coming evolutionary stage of the Internet. [ 47 ]

The metaverse represents an immersive digital environment where users can explore the three-dimensional internet, serving as a link between our physical reality and the virtual sphere [ 55 ]. Individualized avatars can be components of the metaverse, allowing users to move freely in the metaverse and pursue various activities. [ 48 , 56 ]

According to Radoff [ 57 ], the metaverse concept consists of seven layers. The initial level refers to the "experience". This refers to the dematerialization of concrete places. Examples of this are events such as concerts, live events, and travel, which can be experienced within the metaverse. The second layer focuses on the aspect of "Discovery". The focus here is on determining real-time information and status messages. The third layer in the metaverse focuses on the "creator economy". Innovators, engineers, and creatives can benefit from the resources of the metaverse and enrich their activities as a result. The metaverse can be seen as a virtual trading platform on which these players can exchange their works and services. The fourth layer of the metaverse focuses on "spatial computing". Within this digital environment, a spatial context is created that grants the user access and enables them to interact and act. This spatial context can lead to a three-dimensional environment using XR technologies. The fifth layer of the metaverse focuses on "decentralization". The structure of this digital platform is characterized by a decentralized design, which means that no single person exercises unlimited control over it. Users can conduct experiments and promote their growth to an extent of their choice. The sixth layer in the metaverse comprises the "human interface". In the current era, mobile devices are increasingly focused on powerful features to integrate the applications and experiences of the metaverse. In the future, we can expect to see smart glasses on the market that can perform all the functions of a cell phone as well as applications for augmented reality (AR) and virtual reality (VR). The last layer in the metaverse, the "infrastructure", forms the backbone that connects our devices to the network and enables the transfer of content. A significant improvement in bandwidth and a drastic reduction in network latency can be expected as 5G networks come to the fore. In the next phase, 6G is expected to significantly increase speed again. This concept results in a so-called metaverse value chain. [ 57 ]

In the publication by Buchholz et al., seven characteristic features of the metaverse were identified. These seven characteristics include: the fusion of real and virtual worlds, the design as a social medium in which people can communicate, trade and own property, the persistent and long-lasting existence of the metaverse, the integration of extended reality technologies into the system, key actions such as recording the state of users and their real environment, multimodality and a close connection with physical reality. [ 55 ]

Similarities to this definition can be seen in the concept of the metaverse formulated by Steve Benford. He coined five characteristic features for the metaverse, which are as follows: The metaverse is understood as a virtual world, the relevance of virtual reality technologies is essential for its realization, it functions as an independent social network within which interaction takes place by means of individual avatars, the metaverse is characterized by its permanence and is closely linked to our real world. [ 58 ]

In terms of application areas, the metaverse will continuously support the advancement of technology and industrial maturity. The metaverse has the potential to fundamentally transform the economy, education, public service, and social interaction sectors. [ 52 ] Of course, there are various risks associated with the metaverse, including data protection aspects, ethical issues such as fairness considerations and the user-friendliness of the user interface [ 48 ]. In particular, the handling of highly sensitive data underlines the essential importance of a reliable and precise data basis [ 59 ]. The subject of this paper is the following precise definition of the metaverse: The metaverse forms a link between physical reality and a virtual world in which extended reality technologies enable the creation of immersive experiences. Due to its permanence, the metaverse offers society the prospect of making work, trade, and communication activities independent of time and space. The imagination experiences boundless expansion within the metaverse, as it extends into an infinite expanse unrestricted by the physical limitations of reality. In contrast to the transience of the physical world, the digital counterpart of the metaverse exhibits remarkable persistence and longevity. However, it is important to emphasize that the metaverse is more than a simple replica of reality. It has its own characteristic features, which manifest themselves particularly in the personal level of interaction. There is a clear difference between the two worlds: The metaverse struggles to fully replicate the subtleties of human experience, from the tactile sensations of touch to the complex nuances of human emotion. Interpersonal relationships in this virtual sphere often pale in comparison to their real-life counterparts. Section  2 .3 addresses the numerous metaverse applications for the AECO industry and thus the challenges and opportunities.

2.2.3 XR technology

The developments in extended reality, also known as XR, aim to make the metaverse accessible and tangible for users. The technology influences how people perceive their sensory impressions. Extended reality includes technologies such as augmented reality (AR), virtual reality (VR) and mixed reality (MR), each of which enables immersive experiences and describes the spectrum between reality and virtuality [ 60 ]. The experience factor is intensified using AR or VR glasses. Nevertheless, the metaverse can also be explored using conventional devices such as smartphones or laptops [ 61 ]. XR systems allow interactive user participation in virtual content through motion control. These motion controls are in the form of hand-held devices equipped with grips, buttons, triggers, and thumbsticks. This allows users to touch, manipulate, operate, and grasp virtual objects [ 62 ]. Users do not need to be stationary when using these technologies. They use their whole body during the application. The transmission of physical movements into the XR environment is done by tracking positions and rotations [ 63 ].

Table 1 explains the individual terms used under XR technology.
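To make the idea of tracked positions and rotations more concrete, the following minimal Python sketch (an illustration added here, not taken from the cited works) represents a tracked pose as a 3D position plus a unit quaternion and applies it to transform a point from controller space into world space.

```python
# Illustrative sketch: a tracked XR pose as position + unit quaternion (w, x, y, z).
# Names and structure are assumptions for demonstration, not from the cited works.
from dataclasses import dataclass

import numpy as np


@dataclass
class Pose:
    position: np.ndarray   # (3,) world-space position
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z)

    def to_matrix(self) -> np.ndarray:
        """Convert the unit quaternion to a 3x3 rotation matrix."""
        w, x, y, z = self.rotation
        return np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])

    def transform(self, point: np.ndarray) -> np.ndarray:
        """Apply the tracked rotation and translation to a local point."""
        return self.to_matrix() @ point + self.position


# Example: a controller held 1.2 m above the floor, rotated 90 degrees about the
# vertical axis; the local point (0, 0, -0.3) ends up at (-0.3, 1.2, 0) in world space.
pose = Pose(position=np.array([0.0, 1.2, 0.0]),
            rotation=np.array([np.cos(np.pi / 4), 0.0, np.sin(np.pi / 4), 0.0]))
print(pose.transform(np.array([0.0, 0.0, -0.3])))
```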

2.3 Metaverse application AECO

In a comprehensive analysis, McKinsey ascertained that “95% of business leaders expect a positive impact on their industry within 5 to 10 years” [ 70 ]. This points to a significant impact of metaverse applications on various aspects of daily life, both in private and professional contexts. As a result, McKinsey calls on companies to actively pursue the development of metaverse business. [ 70 ]

The integration of the metaverse offers numerous application possibilities in the architecture, engineering, construction, and operations (AECO) sector [ 24 ]. Nevertheless, the practical application of the metaverse in the AECO sector or in the BIM methodology has not yet been proven. This implies the need for further research into potential fields of application and the integration of the metaverse into the life cycle of buildings [ 24 ].

Metaverse applications can be an enrichment for customers, particularly in architecture. Through virtual tours, customers can gain realistic impressions of a property that is yet to be realized. [ 71 ] Metaverse options extend across various platforms where users can interact, trade, market their products, and offer services. Additionally, the metaverse enables virtual meetings via customized 3D avatars. For instance, companies may strategically devise virtual assets within the metaverse for promotional purposes. [ 72 ] Well-known platforms for this include The Sandbox and Decentraland [ 73 ].

Nieradka found that VR technology can have a positive impact on real estate sales methods by enabling multi-sensory presentations. On the one hand, this leads to an increase in user-friendliness and innovative ability; on the other hand, it influences the personal feeling towards the property. By strengthening imagination, this can lead to a more emotional perception. [ 74 ] A study by Brenner analyzed the impact of VR applications on potential property buyers. The results suggest that the metaverse has the potential to significantly change sales processes through VR technologies. Such a development can lead to an increase in property sales. [ 75 ] In the real estate marketing sphere, the supportive use of VR technology can create a unique experience for buyers and agents by simulating properties in different geographical locations [ 76 ].

By using the metaverse, various scenarios of real situations can be simulated, whereby the data generated can help to assess and understand the feasibility and sustainability of the physical property more accurately [ 77 ]. This gives the user the opportunity to relive digital experiences regarding certain events, allowing precise recommendations for action to be derived for the real world [ 78 , 79 ].

The existing situation illustrates a current underrepresentation of verifiable studies that analyze the interaction between the metaverse and the construction industry in depth [ 3 ]. However, there are already initial attempts at VR applications in the construction industry [ 80 ]. Key indicators of success in construction management are manifested through the quality achieved, time management and cost management. In his recently published paper, Oz highlights that construction management can reap significant benefits from the integration of the metaverse. This extends to multi-faceted areas such as design planning, work planning, risk management and resource management. [ 3 ] The metaverse enhances the quality of decision-making processes, facilitates operational workflows, and promotes a sense of teamwork by removing barriers to communication [ 81 ]. Throughout the entire construction project cycle, the metaverse provides valuable tools for monitoring and maintaining projects. The integration of data from sensors into the metaverse enables continuous monitoring of construction progress, equipment usage and resource allocation in real time. This enables the parties involved to identify potential bottlenecks, monitor the progress of project milestones, and make necessary adjustments at an early stage to ensure a successful project. [ 82 ]

In addition, design and visualization skills are proving to be fundamental to success in the construction industry in the metaverse. This provides immersive access to advanced 3D modelling and visualization tools. Through these tools, stakeholders can create, adapt, and visualize designs in a virtual environment, enabling both realistic and immersive representation. [ 81 , 83 , 84 ] Visualizing designs in the metaverse makes it easier to understand and evaluate design decisions. Defects and conflicts can be identified more efficiently, which leads to improved interdisciplinary communication and minimizes the risk of potentially costly design errors. [ 81 , 85 , 86 ] Furthermore, the metaverse facilitates the external management of assets and enables facility managers to monitor and maintain buildings in an efficient manner. This helps to reduce operating costs and extend the life of constructed assets. [ 81 ] It is also possible for individuals to acquire digital real estate and rent it out in a yield-oriented approach. This practice opens prospects for investment. A development that can already be observed is that various events such as exhibitions, music festivals, and concerts are taking place in the metaverse. [ 56 , 78 ] Successful establishment of the metaverse in the real estate and construction industry requires acceptance on the part of interested parties, with a positive user experience being an essential concomitant. [ 87 , 88 ]

The key components of the metaverse's success in the construction industry undoubtedly lie in its ability to scale and adapt. An essential aspect is that the metaverse can efficiently support construction projects of varying size and complexity. The scalability of the metaverse platform is central to this, as it should enable the seamless integration of multiple projects and keep pace with the growth of data volumes as projects progress. This ensures that the platform is flexible enough to meet the diverse requirements of the construction industry. [ 81 , 89 , 90 ] Through an adequate adaptability of the platform, it becomes possible to flexibly respond to evolving project requirements, thereby enabling the integration of various new technologies and functionalities. The assurance of adaptability and scalability thus forms the foundation for the long-term performance of the metaverse within the construction industry. [ 91 ] The integration of the metaverse into the construction industry requires meticulous consideration of regulatory and legal aspects. To ensure compliance and protect the interests of stakeholders, it is essential to establish a precise definition of data protection. [ 81 , 92 , 93 ] A solid technological foundation is also essential for the successful implementation of the metaverse. This includes a powerful internet connection, hardware that supports virtual and augmented reality applications and scalable cloud computing resources. Only with this foundation can smooth and continuous interaction in the metaverse with all project participants be guaranteed. [ 94 ] The realization of effective communication and collaborative cooperation in the metaverse for the real estate and construction industry is only possible under certain conditions. The metaverse gives involved parties the ability to participate in real-time conversations, convey project messages and share information seamlessly. Virtual conferencing, shared workspaces and instant messaging facilitate effective communication and strengthen collaboration between team members, regardless of their physical presence. [ 95 , 96 ]

3 Methodology

To evaluate the possible effects and approaches as well as the advantages and disadvantages for the AECO industry, this study uses a multi-step approach, shown in Fig. 1. The approach combines a quantitative and a qualitative survey. It was chosen as the methodology because it enables an analysis and comparison of early adopters in qualitative workshops with the perspectives of the broader AECO industry.

Figure 1: Methodology of this paper

In step 1, a literature review was conducted. In this review, the keywords "metaverse", "construction", "facility management", "architecture", "digitization", "XR", and "building" were searched in various combinations in abstracts, headings, and full texts. Scopus, Google Scholar, ScienceDirect, and SpringerLink were used for this purpose (see Sect. 2). A comprehensive literature analysis was conducted, considering over 90 scientific sources. The focus was on analyzing recent research findings, primarily from publications dated 2019 or later. Predominantly, English-language sources were used to ensure an international perspective and access to the latest developments and discussions in the research field. The literature review encompassed a variety of renowned journals that significantly contribute to the scientific discourse in the fields of construction and information technology. Notable journals considered include the Journal of Information Technology in Construction, International Journal of Construction Management, Journal of Civil Engineering and Construction, Journal of Building Engineering, and International Journal of Information Management.
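Purely as an illustration of such combined keyword queries (the authors' exact query syntax is not reported, so this is an assumption), pairwise boolean search strings could be generated as follows:

```python
# Illustrative only: building pairwise boolean search strings from the review
# keywords; the exact queries used by the authors are not reported.
from itertools import combinations

keywords = ["metaverse", "construction", "facility management",
            "architecture", "digitization", "XR", "building"]

queries = [f'"{a}" AND "{b}"' for a, b in combinations(keywords, 2)]
print(len(queries))   # 21 pairwise combinations
print(queries[0])     # e.g., "metaverse" AND "construction"
```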

In step 2, workshops were conducted with various stakeholders. The workshops aimed to identify challenges and opportunities of the implementation of the metaverse in the AECO industry. Furthermore, they aimed to evaluate the status quo of its practical application. In total, four workshops were conducted between November 2022 and January 2023. The workshops were held with project developers, planners, constructors, and facility management companies, with a total of nine experts taking part across the workshops. The experts were selected based on their experience with the metaverse as well as their plans for implementing a metaverse strategy. This should ensure that experts who are pioneers in the implementation of the metaverse are included, as well as companies that have so far only dealt with a metaverse strategy in theory. These were moderated workshops in which different perspectives on challenges and opportunities were discussed. Based on this, a further analysis of literature and use cases was done to validate and compare the results of the literature review and the practical experiences (see Sect. 4.1).

In step 3, a quantitative online survey was conducted. The questions were elaborated between July 2023 and September 2023. The content of the survey questionnaire is based both on the findings of the literature analysis and on the results of the workshops. It serves to capture a broader picture of the mood of the stakeholders in the AECO sector. The literature analysis and the workshops produced initial findings that shed light on potential challenges such as interoperability and data protection. At the same time, initial use cases were identified, including the possibility of using the metaverse as a new communication tool that bridges the gap between the real and virtual worlds. In particular, the literature shows potential in the planning phase by using XR technology and individual avatars to make the planning design more vivid and to use simulations of real events to minimize planning errors. It is of interest to find out whether the industry players also recognize these potentials and whether they perceive the identified challenges in a similar way as described in the literature and the workshops (see Sect. 2). The results of this survey will help to shape further steps towards metaverse integration and identify the need for additional areas of research.

At the beginning of September 2023, the online survey was checked by 16 pre-testers from various areas of expertise in the AECO industry, such as project developers, constructors, facility managers, and building informaticians. Based on this, the online survey was revised and optimized. In particular, the formulations were made more precise in order to achieve exact results. The online survey was activated on 16 October 2023. In order to receive as many answers as possible, the link was shared on LinkedIn, via mail, via newsletters (e.g., Fraunhofer or Mittelstand Digital Zentrum Bau), and in various companies and courses. The questionnaire was written in German and English. In total, 291 people participated in the survey, all of them respondents from Germany. The total number of possible stakeholders in the German AECO industry is 3,500,000 [ 97 ]. With a margin of error of 5% and a confidence level of 90%, this results in a required sample size of 273, which means that the number of participants (291) exceeds the required number of respondents and the sample can be considered representative (see Sect. 4.2).
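For transparency, the stated sample-size threshold can be reproduced with Cochran's formula and a finite-population correction; the sketch below is an illustration added here (the authors' exact z-value and rounding are not reported, so the result may differ by one or two respondents).

```python
# Illustrative sample-size check: Cochran's formula with finite-population
# correction for a 90% confidence level, 5% margin of error, p = 0.5.
import math

z = 1.645          # z-score for a 90% confidence level (assumed rounding)
e = 0.05           # margin of error
p = 0.5            # most conservative assumed proportion
N = 3_500_000      # stakeholders in the German AECO industry, as cited above

n0 = (z ** 2) * p * (1 - p) / e ** 2     # infinite-population sample size
n = n0 / (1 + (n0 - 1) / N)              # finite-population correction
print(math.ceil(n0), math.ceil(n))       # ~271; the paper reports 273, and 291
                                         # respondents exceed either threshold
```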

4.1 Workshops

The four workshops were conducted between November 2022 and January 2023. They were led by a single moderator, who designed, executed, and evaluated the workshops. In the workshops themselves, various brainstorming methods were used to evaluate potential use cases and the associated challenges and opportunities. The results were developed and documented in individual brainstorming phases, sharing phases, and joint group discussions. All participants of the workshops were early adopters who are using metaverse technologies in their daily business or are planning to do so. This aimed to evaluate the challenges and opportunities that may arise or have arisen from implementing the metaverse in the AECO industry.

The workshops were based on the aforementioned literature analysis as well as on reports of proofs of concept from other industries, such as mechanical engineering, medicine, or teaching. Furthermore, the experiences of the authors were integrated into the preparation of the workshops. The workshops were semi-structured and lasted approximately four hours. Typically, two to four persons (excluding the moderator) participated in each workshop.

4.1.1 Evaluation of use cases

Firstly, the workshops aimed to evaluate possible applications of the metaverse for the AECO industry and to validate the applications mentioned in the literature (see Sect. 2.3). Reflecting the participants of the workshops, the use cases are divided into use cases for real estate development and marketing, design, construction management, and facility management. As shown in Table 2, various possible use cases for the application of the metaverse in the AECO industry were evaluated in the workshops.

Table 2 shows that the four stakeholder groups associate different use cases with the implementation of the metaverse in the AECO industry. While real estate developers see use cases especially in the fields of marketing and new business models, designers focus on information use and organizational aspects. For construction management, better communication through immersive technologies was mentioned in particular. Facility managers expect better data management and storage from implementing the metaverse in FM.

4.1.2 Requirements and foundations for implementing the metaverse in the AECO industry

Besides the definition of possible use cases for the stakeholders (for a summary of the use cases see Sect. 4.1.1), technical, social, and economic barriers and opportunities were discussed. In particular, the following aspects were mentioned in the workshops:

The metaverse offers new possibilities and business models. All participants stated that the metaverse could improve their current business model. In particular, the generation of new business areas (e.g., renting and developing spaces in the metaverse), supporting current processes (e.g., immersive construction or planning meetings), or developing new, digital processes (e.g., integrating new spaces of communication) were mentioned.

The metaverse is more than the use of VR or AR. The workshop participants stated that most of their partners and customers believe that the metaverse is the same as the use of VR or AR glasses. Therefore, they emphasize the development of one uniform and accepted definition of the metaverse for the AECO industry.

The metaverse is a new approach to designing, constructing, and operating buildings. The social aspect must not be neglected when integrating the metaverse into the AECO industry. Especially given the ongoing digital change in the industry (e.g., Building Information Modeling or Artificial Intelligence), a new digital tool can lead to a defensive attitude on the part of those involved. Therefore, it is necessary to educate and train the stakeholders of the AECO industry about the aims and contents of the metaverse.

The metaverse contributes to the need to integrate even more disciplines into the industry. This is especially due to newly required programming skills, new skills for the provision and operation of hardware, and the change management necessary to implement the metaverse in the AECO industry.

The metaverse is currently not a trending topic in the AECO industry. Although they themselves are early adopters, the workshop participants stated that the metaverse is not well known in the AECO industry and is not yet a relevant standard.

These aspects show that various use cases (see Sect. 2.3) could be generated. On the other hand, the AECO industry is conservative, so the implementation of new technologies or tools needs a lot of time. Therefore, the participants stated that only a few use cases should be implemented and researched at the beginning. From that point, the metaverse could gain more traction and become a relevant state of the art in the AECO industry.

The results of the workshops (use cases, challenges, opportunities) and the literature analysis were incorporated into the quantitative survey. The following sections show the results of this survey. The questionnaire can be found in the “ Appendix ”.

4.2.1 Basics

In total, 291 participants answered the questionnaire. It has to be noted that all questions were optional; for that reason, not all responses are complete for every question. Regarding the age of our participants (cf. Fig. 2), more than two-thirds of those who answered are between 16 and 29 years old (68.49%; n = 150), followed by the groups of 30–40 years and 41–50 years (each 12.79%; n = 28). Nine participants were in the age group of 51–60 years (4.11%). Three participants reported belonging to the age group of 61–70 years (1.37%). No respondent belongs to the age group of 71–80 years (0%), while one participant did not answer the question (0.46%).

Figure 2: Age structure of the online survey

We then asked the participants whether they are students or professionals, followed by a specific question on the degree program they are enrolled in or, respectively, their role in the real estate industry. We can observe a balanced picture regarding students and professionals (cf. Fig. 3): 140 students (48.11%) and 150 professionals (51.55%) answered the questionnaire, with one participant not responding to this question (0.34%). The students were mostly from civil engineering (68.57%, n = 96), followed by green building engineering (16.43%, n = 23), industrial engineering (8.57%, n = 12), and digitization management and management (0.71%, n = 1 each). Six participants did not provide an answer to that question (4.29%).

Figure 3: Students and professionals answering the questionnaire

For the professionals, the roles in the industry are more varied, as shown in Fig. 4. With 5.84%, most of the respondents self-assessed as civil engineers (n = 17), followed by architects and consulting (n = 11, 3.78% each), and planners of technical building equipment and facility management (n = 10, 3.44% each). All other predefined roles were mentioned as well. Additionally, eight participants chose “other” and added “BIM Manager”, “IT” (two times), “education”, “professor”, “measurement”, and “public client”.

Figure 4: Self-assessment of roles of the professionals responding to the questionnaire in the real estate industry

When asked how confident the participants are regarding the usage of the term “metaverse” on a Likert scale with 5 being very confident and 1 not confident at all, a majority (55.56%, n = 120) is unconfident or rather unconfident (1: n = 53, 24.54%; 2: n = 67, 31.02%; cf. Fig. 5). 26.85% (n = 58) seem to be unsure and selected a value of three. Only 17.59% (n = 38) of participants felt confident or very confident (4: n = 33, 15.28%; 5: n = 5, 2.31%).

Figure 5: Self-assessment for the usage of the term metaverse

Based on the definition of Buchholz et al. (2022), we asked our participants to rate the seven characteristics described in the paper [ 55 ]. Figure 6 visualizes the average values captured in our questionnaire. On average, the highest-rated characteristic was multimodality (average 4.03, SD = 0.99). Second, the characterization as an integrated system combining XR and other technologies was rated (avg. 3.84, SD = 1.04). Then, participants rated the combination of virtual and augmented real worlds (avg. 3.75, SD = 1.0), directly followed by the social medium aspects with interaction, communication, collaboration, and property ownership (avg. 3.72, SD = 1.05). Next, the participants agreed with an average of 3.3 that “capturing the state of the user and the real environment is a key action for metaverse applications” (SD = 1.15), followed by a persistent and long-lasting state (avg. 3.04, SD = 1.25). The lowest-rated characteristic is the close coupling with reality, with an average of 2.9 (SD = 1.17).

Figure 6: Rating of the characteristics of the metaverse on a Likert scale by the participants (average values)
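The average values and standard deviations reported here and in the following sections can be derived from the raw 1–5 Likert responses in a straightforward way; the short sketch below uses hypothetical response data (not the study's raw data) to illustrate the computation.

```python
# Illustrative computation of per-item averages and standard deviations from
# 1-5 Likert responses. The values below are hypothetical example data, and
# whether the study used the population or sample SD is not reported.
import numpy as np

responses = {
    "multimodality": [5, 4, 4, 3, 5, 4],
    "close coupling with reality": [3, 2, 4, 3, 2, 3],
}

for characteristic, ratings in responses.items():
    arr = np.array(ratings, dtype=float)
    print(f"{characteristic}: avg {arr.mean():.2f}, SD {arr.std(ddof=0):.2f}")
```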

Finally, we also asked whether the participants see the metaverse as a new communication basis in the construction and real estate industry. As shown in Fig. 7, a majority of 64.38% (n = 141) agreed, 23.29% (n = 51) disagreed, 6.85% (n = 15) chose “other” and mostly expressed that this applies only partially, and 5.48% (n = 12) gave no answer.

Figure 7: The metaverse is seen as a new communication basis in the construction and real estate industry by a majority of the participants

4.2.2 Influence of the metaverse on the buildings’ lifecycle

One main goal of our survey was to identify which phase in a building's lifecycle could benefit most from metaverse technology usage (cf. Fig. 8). Hence, we asked the participants which phase they think will be changed the most by the metaverse. Here, the participants ranked the marketing phase in first place (avg. 4.12, SD = 1.1), followed by the planning phase (avg. 3.93, SD = 1.03). For initiation or project planning, an average value of 3.61 (SD = 1.13) was recorded. The redevelopment phase was rated with an average value of 3.05 (SD = 1.13). The two lowest-rated phases were the demolition phase (avg. 2.83, SD = 1.12) and, surprisingly, the construction phase (avg. 2.68, SD = 1.2).

Figure 8: Building lifecycle phases that may benefit from the metaverse (average values)

With the following questions, we dug deeper into the specific phases, starting with the planning phase depicted in Fig. 9. In the planning phase, the highest-rated application according to our survey is virtual property inspection using XR technologies (avg. 4.15, SD = 1.09). The collaborative cooperation of all stakeholders (avg. 3.8, SD = 1.1) is directly followed by digital real estate planning to support decision making (avg. 3.75, SD = 1.07) and simulation for training (avg. 3.71, SD = 1.09). Integrating cubatures into the landscape (avg. 3.4, SD = 1.12) was rated higher on average than the digital purchase and sale of real estate in the metaverse (avg. 3.37, SD = 1.25) and support with digital building applications and coordination with approval authorities (avg. 2.95, SD = 1.31).

Figure 9: Relevant application areas especially for the planning phase (average values)

When looking at the average values in the construction phase (cf. Fig.  10 ), the variance is not high: All values are between 3.36 and 3.7, except for one application area. The tracking of construction defects received the lowest rating in our study (avg. 3.0, SD = 1.26). In contrast, virtual and collaborative construction project planning and site meetings received overall the highest score (avg. 3.7, SD = 1.15). In descending order the participants rated the usage of metaverse tools for building structure creation and checking and real time designs (avg. 3.65, SD = 1.04), the simulation of construction processes, hazard and interface analysis (avg. 3.6, SD = 1.16), the coordination of resource requirements by simulation and digital construction project planning (avg. 3.58, SD = 1.07), the coordination of logistics (avg. 3.44, SD = 1.08), safety training on the construction site (avg. 3.42, SD = 1.31), and the construction monitoring and actual state recordings using drones with live transmission to the metaverse (avg. 3.36, SD = 1.21).

Figure 10: Relevant application areas especially for the construction phase (average values)

Another phase that might be relevant is the operation phase of a building. Hence, we asked the participants to rate several application areas here as well (cf. Fig. 11). With a value of 3.69 (SD = 1.06), building automation control was rated highest on average, followed by the explanation of safety measures or building instructions for users/occupants (avg. 3.61, SD = 1.06). The metaverse as an interaction community for residents (avg. 3.41, SD = 1.19) was rated slightly above indoor navigation (avg. 3.35, SD = 1.21) and virtual facility management (avg. 3.31, SD = 1.17). Rated lowest was the simulation of media flows (avg. 3.11, SD = 1.11).

Figure 11: Relevant application areas especially for the operating phase (average values)

As seen above, the participants saw great potential for metaverse usage regarding marketing. Asked about specific application areas, the results show that virtual property viewings to market the property are rated highest (avg. 4.12, SD = 1.04), followed by consultation and sales talks in the metaverse (avg. 3.55, SD = 1.26), and due diligence (avg. 3.09, SD = 1.09).

For the recovery of buildings in the demolition phase, we asked specifically if the digital twin could be relevant to support the recycling and recovery process of used substances and materials (avg. 3.26, SD = 1.09), or if the coordination of the demolition concept may be supported by metaverse technologies (avg. 3.29, SD = 1.13). Both values appear to be rather in midfield, reflecting the rather low value for benefits in the demolition phase expressed in Fig.  8 .

The results of both the marketing and the demolition phase application area ratings are summarized in Table  3 .

4.2.3 Challenges and opportunities of the metaverse

With our questionnaire, we also covered challenges in relation to the metaverse (cf. Fig. 12). As this was a multiple-choice question, participants were able to give multiple answers. The answer “data protection and privacy” was selected 171 times, placing it first. Then, the participants mentioned the dependence on the digital world (n = 143) and insufficient hardware equipment (n = 125). High implementation costs are only in fourth place with 123 mentions. On the other end of the spectrum, the risk of a monopoly position of a provider (n = 68), too much fusion of digital and real properties (n = 67), and digital currencies (n = 64) are not seen as main challenges. For the “other” option, we again provided the possibility to enter free text, which nine participants did, mentioning amongst others “entry barriers”, “immature technology”, “communication problems and interpersonal understanding”, or the creation of a “potential digital parallel world, which should not be the main focus due to real tasks” (0.34%, n = 1 each).

Figure 12: Challenges in relation to the metaverse (n = 291)

Regarding the opportunities of metaverse usage in the real estate industry (cf. Fig. 13), improved understanding through enhanced visualization (avg. 4.0, SD = 1.04) received the highest rating, followed by the simplification of decision making with 3D visualizations in the metaverse (avg. 3.95, SD = 0.98). Communication independent of time and place (avg. 3.73, SD = 1.15) and the expanded communication basis of all actors concerned (avg. 3.65, SD = 1.05) follow next. Other opportunities are the increased efficiency of process flows within the lifecycle phases (avg. 3.64, SD = 0.97), flexibilization during the lifecycle (avg. 3.4, SD = 1.08), open communication structures (avg. 3.34, SD = 1.16), and lastly an increased sense of reality (avg. 3.22, SD = 1.34).

Figure 13: Opportunities of a real estate metaverse

Taking a closer look at the different functions of the professional participants, one can see the distribution shown in Fig. 14 regarding the opportunities rating of a real estate metaverse. It must be said that for investment and property management, only one participant responded to the survey, and for project development, only two participants were involved (cf. average values and standard deviations in Table 4). Hence, the high values for project management participants, especially for open communication structures and flexibilization in the course of the life cycle, might be misleading. The same applies to the “other” category, as multiple functions with low participation numbers are included (cf. Sect. 4.2.1 and Fig. 4).

Figure 14: Opportunities of a real estate metaverse rated by the professional participants

If we in turn consider the age of all participants (students and professionals) and examine the ratings for the different age groups, we get the distribution shown in Fig. 15. One might get the impression that older participants in particular rate lower than younger ones, but again we only had three participants who identified with the age group of 61–70 years (cf. average values and standard deviations in Table 5). Moreover, the age group of 41–50 years gives higher ratings than the age group of 30–40 years. Note that the age group of 16–29 years had lower average values than the 30–40- and 41–50-year-olds.

Fig. 15 Opportunities of a real estate metaverse rated by student and professional participants, divided into age groups
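For transparency, the following sketch shows how the reported averages and standard deviations, as well as an age-group breakdown such as the one in Fig. 15, can in principle be recomputed with pandas; the file and column names are hypothetical placeholders, not the study's actual variable names.

    # Minimal sketch (assumed column names; Likert items coded 1-5; pandas uses
    # the sample standard deviation by default).
    import pandas as pd

    responses = pd.read_csv("survey_responses.csv")
    opportunity_items = [
        "improved_understanding_visualization",
        "simplified_decision_making_3d",
        "communication_independent_time_place",
    ]

    # Overall average and standard deviation per opportunity item.
    print(responses[opportunity_items].agg(["mean", "std"]).round(2))

    # The same ratings broken down by age group, mirroring Fig. 15 / Table 5.
    print(responses.groupby("age_group")[opportunity_items].mean().round(2))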

4.2.4 Requirements for the implementation of the metaverse in AECO industry

To implement a metaverse in the AECO industry, some prerequisites on different levels (e.g., technical or personal) may be required. These form the last two major questions in our questionnaire. First, we investigate the technical requirements: here, the average values were above 4.0 for all categories (cf. Fig. 16). A reliable IT infrastructure is agreed on by most of the participants (avg. 4.44, SD = 0.95). Then, adequate security solutions (avg. 4.26, SD = 0.99) and data format interoperability and interfaces (avg. 4.26, SD = 0.86) share the second place, directly followed by software interoperability and interfaces (avg. 4.24, SD = 0.86). Next, hardware interoperability and interfaces (avg. 4.21, SD = 0.9) is mentioned. The scalability of the technology for growing demand (avg. 4.01, SD = 1.0) ranks last.

Fig. 16 Technological prerequisites for implementing the metaverse in the lifecycle (average values)

Personal knowledge might also be required when implementing a metaverse within the AECO industry (cf. Table 6). Here, participants rated adaptability to technology changes highest (avg. 4.31, SD = 0.91), followed by the understanding of security risks (avg. 4.01, SD = 0.98), analytical skills for data analysis (avg. 3.88, SD = 0.9), and technical know-how, such as programming languages and technology in general (avg. 3.85, SD = 1.11).

4.2.5 Further remarks and free texts

At the end, we gave our participants the opportunity to add suggestions or comments on the topic of metaverse in the real estate industry. While 208 participants did not provide an answer, and 72 did not see the question or finished the questionnaire before that question, 11 chose to share their thoughts in the free-text field. Although these answers are not representative and can only be seen as anecdotal, we want to share a summary of them. The responses were mostly negative. The metaverse was seen as a “technology without a promising future, although there might be interesting tools, but a too high barrier of entry, so that too few people are going to access it”. Another respondent mentions “data manipulation as new botched-up construction”. Another participant does not see the advantage of the metaverse and asks “why setting up additional (and ecological expensive) infrastructure for a virtual environment, if we already have Teams, cloud storage, and open data formats? And why should I do a meeting in a virtual room, when I can do a Teams call instead?”. Another statement emphasizes the digital transformation in general, stating “the digitization in the construction industry is and remains (initially) a mammoth task”. Similarly, another participant states: “We all have enough to do with the transition to digital building modelling, an additional level of action in the metaverse can only arise—if at all—as an additional, costly marketing channel for property sales. Currently a high-risk application.” Also, the fact that the questionnaire did not contain an introduction of the term metaverse was criticized: “You should briefly explain, what metaverse means, especially if you want to get a complete overview with the survey, including small and medium enterprises. I did not know the term yet, although the topic is really present for us.” Another voice addresses the metaverse definition as well: “The definition of the metaverse should be substantiated. I think that Microsoft with Teams and Facebook with virtual ‘phantasy worlds’ are working on fundamentally different goals.” Regarding the visual aspects, one participant states: “In my view, the metaverse is reduced far too much to the visual level. The focus should be much more on the integration of building data and built space. The aim is to create an accessible meta-level for buildings. Based on this, VR access can be created—not the other way round.”

5 Discussion

Our study yields multiple observations, which we discuss in the following subsections. Overall, we can identify several limitations and challenges.

5.1 Diminishing hype or yet to arrive?

The anticipated metaverse hype [ 77 ] seems to have subsided, or it has not yet fully reached the construction and real estate industry. On the one hand, this could be due to the practical nature of the industry, where tangible results and direct applications take precedence over futuristic projections. The lack of visible, immediate benefits may contribute to the metaverse being perceived as a distant future rather than an imminent reality. There is something to that argument, given that many other issues in digitizing the industry will have to be solved first. This includes issues such as data access, roles, and rights, or simply enough rolled-out and sufficiently powerful mobile devices with fast mobile data communication. On the other hand, Amara’s law states that we tend to overestimate the effect of a technology in the short run and underestimate its effect in the long run. Just like technical drawings in different domains went from 2D ink on paper, over 2D computer-aided design (CAD), to 3D modelling, the AECO industry is following the same trend. Soon, 3D data will be commonplace and readily available to users on location at the right time and in the right quality. Being on location with mobile devices literally allows putting data in the right perspective based on the user’s position and orientation in the environment. Computer applications of the last six decades have mostly been presented on 2D screens of different sizes. We have witnessed the advent of mobile computing largely fulfilling the idea of Weiser’s ubiquitous computing, but these applications mostly remained on 2D screens (with pervasive games like Pokémon Go and a few other examples breaking the general rule). Now we are slowly progressing towards spatial computing with mobile devices, be it tablets or head-mounted displays. This will benefit all applications whose data is related to the world and three-dimensional. This future 3D Internet, a.k.a. “the Metaverse”, will not replace 2D applications but provide additional possibilities. The three-dimensional data of the AECO industry is an ideal candidate to benefit from this evolution in the long run and to provide for interactive 3D data exploration and modification using XR technology.

5.2 Demographic divide in survey participation

It is notable that predominantly younger individuals participated in the survey, which could skew the data towards a demographic more inclined to be optimistic about technological innovations. However, our data already show strong skepticism among the participants. Moreover, regarding the opportunities of a potential real estate metaverse, the age group of 16–29 years even had lower average values for all application areas than, e.g., the 30–40 or 41–50-year-olds. Although our study only includes answers from three people identifying as 61–70 years, there is no clear trend in our data that older people are more skeptical or rate the strengths more critically than younger participants. There is no clear demographic difference regarding the potentials of a real estate metaverse. While younger participants (16–29 years) may in general be considered more open to the metaverse, their rating of potentials appears to be lower than that of other (older) age groups, potentially leading to an even slower adoption curve (cf. also “Underreported skepticism” below). One would expect the entry barriers to be lower for younger generations, as they are more familiar with new technological devices. But XR technology appears not to be widespread enough yet—also regarding leisure activities and entertainment at home—for younger generations to take these technologies for granted and want to employ them in their professional daily routine. Note that this reflects only the perspective of the real estate industry in Germany. How is it possible to get more young people excited about these upcoming technologies and prepare them for later professional usage? Education, especially at universities but also in craft training, plays a crucial role in this context. The earlier young people get in touch with these technologies, the easier a potential use in their professional life becomes. At the very least, they should know about the technological possibilities and limits in order to evaluate further developments and technological advancements on their own.

5.3 Abstract nature of the metaverse

The perception of the metaverse as a rather abstract topic presents significant barriers to its acceptance. For an industry grounded in physical space and tangible assets, the challenge lies in translating the virtual, often nebulous, concepts of the metaverse into concrete, actionable business strategies. This requires a multifaceted approach that is driven by the users’ needs rather than technological prowess in order to turn the abstract into something comprehensible. Based on existing experience from human–computer interaction studies over the last decades, this calls for several strands that need to be handled and intertwined.

First, pilot programs need to be initiated to build relevant prototypes and proofs of concept in a user-centric fashion. Only by doing so will it be possible to truly reflect on ideas, design decisions, and the usefulness of the chosen approach in order to identify flaws. This has to be done in an iterative fashion to address the flaws and integrate new ideas, and the process has to include all relevant stakeholders, e.g. from research and development through sales to customers and business partners. Data and insights from these initiatives can inform larger-scale implementations.

Second, organisations should collaborate with others and form strategic partnerships. In a competitive labour market, it is unlikely that the required and scarce technological expertise will exist within one’s own organisation. Therefore, organisations have to form strategic partnerships to leverage external expertise.

Third, once results exist, awareness about them should be raised for increased understanding and education. This can help stakeholders comprehend the potential benefits and implications for the business, which in turn should help to allocate sufficient resources, both in terms of budget and talent, to support metaverse initiatives. Allow for evolutionary growth rather than one-off initiatives and identify areas where such initiatives can enhance existing operations rather than disrupting them.

As metaverse-like initiatives are unlikely to replace existing work practices and interaction metaphors, it is important to bear in mind that not everything needs to be 3D—especially if the underlying data is less than 3D, or if the current way of working in 2D on planar surfaces is effective and efficient.

As the metaverse can be seen as a “3D Internet”, it makes sense to look at analogies in the Internet’s development. Early stages are concerned with infrastructure, compute, and connectivity to pass data around. Then standards appear. Preferably, they should be open like HTTP to enable building and thriving on top of them, rather than closed and leading to platform capitalism. Perhaps the AECO industry can still draw upon Gropius’ Bauhaus manifesto, which over 100 years ago postulated a unity of art and craft conceiving the structures of the future, to be built by a million hands. This can be regarded as an analogy to the evolution of the Internet.

Diverse applications that catered for specific needs followed the Internet standards. More recent developments in social media focussed on user engagement, user profiling, and microtargeting. People get services allegedly for “free”, for which they pay with their attention and private data in return. These are then marketed to modify our behaviour in exchange for money. Such a business model is unsustainable in the metaverse, although big tech might see that otherwise [ 98 ]. Therefore, it seems paramount to tackle the difficult transfer problem from academia to industry and also to get used to paying for software applications and their development again [ 99 ].

5.4 Skepticism versus potential for communication

There is a pronounced skepticism among survey participants, yet they simultaneously recognize the metaverse as a potential new foundation for communication within the industry. This contradiction indicates a complex attitude: while there is caution, there is also an understanding that the metaverse could revolutionize interactions with clients and within the industry, suggesting a readiness to explore its capabilities further. Hence, other factors might be more obstructive to the metaverse implementation within the industry. Besides that, more information on the potentials of the metaverse is needed. This is also reflected in our data: while the construction phase was rated with an average value of only 2.68, the application areas within that phase received average values greater than 3. Hence, the participants saw a certain potential for the application areas but were skeptical whether the overall work in the construction phase is going to change with metaverse usage. Regarding the characterization of the metaverse as a social medium where people can interact, communicate, collaborate, and own property, an average value of 3.72 shows slight agreement among the participants. And 64.38% see the metaverse as a new communication basis in the construction and real estate industry. Is the metaverse thus only a new communication platform for the construction and real estate industry? It will certainly offer new ways of interacting with clients, with other stakeholders, and within companies, although there is still a way to go. Communication will not be the only aspect, but one of the key aspects when introducing metaverse technologies to the construction and real estate industry. Here again, low participation hurdles for that communication as well as a multimodal approach supporting different devices with different immersion levels seem to be important for broad distribution and overall acceptance. Instead of a closed system, open standards and exchange formats need to be considered, especially with regard to the construction and real estate industry, which has developed such standards, namely the Industry Foundation Classes (IFC) amongst others, for file-based BIM model exchange over the last decades.
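To make the role of open exchange formats more tangible, the following minimal sketch reads an IFC model with the open-source ifcopenshell library and lists its wall elements. The file name is a placeholder, and the snippet is only an illustration of vendor-neutral, file-based exchange; it is not part of the study itself.

    # Minimal sketch of file-based openBIM exchange via IFC (assumes the third-party
    # `ifcopenshell` package is installed and "building.ifc" is any IFC export).
    import ifcopenshell

    model = ifcopenshell.open("building.ifc")
    print(model.schema)  # e.g. "IFC4" - the vendor-neutral schema version

    # Iterate over wall elements regardless of which authoring tool produced the file.
    for wall in model.by_type("IfcWall"):
        print(wall.GlobalId, wall.Name)

Because the format is openly specified, the same file can be produced and consumed by different authoring, checking, and visualization tools, which is exactly the interoperability the respondents call for.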

5.5 Underreported skepticism

The research indicates that the actual level of skepticism may be higher than what the survey data reveal. This ‘dark figure’ of skepticism can signify a larger group of industry professionals who are reticent about the metaverse, possibly due to a lack of clear understanding or fear of the unknown. Significant potential for new business models within the life cycle was recognized during the workshop, but at the same time a considerable need for action was also identified. This relates to the necessary programming knowledge, data protection measures, interoperability, and the need for training for correct and targeted application. These aspects were also considered in the subsequent survey. On the one hand, the survey results make it clear that the metaverse is seen by respondents as a desirable basis for communication, particularly in the initiation, planning, and marketing phases. On the other hand, the construction, renovation, and administration phases are perceived by the respondents as having less metaverse potential. The perceived potential is limited to linking the metaverse with XR technologies, which are seen as highly coherent by respondents, with 3D visualization seen as optimizing the decision-making process. However, the assessment of potential challenges, particularly of a technical nature, is rated as extremely high. Almost 60% of respondents see data protection as a significant hurdle, while 50% see the possibility of dependence on the digital world as a potential problem in the real world. In addition, the mean score of 4.24 on a scale of 1 to 5 indicates that a certain degree of interoperability in the metaverse is an important technical requirement to ensure successful application in the AECO industry. This interoperability must also be able to cope with regulatory changes and the diversity of construction projects. Furthermore, the interviewees identified challenges such as insufficient user acceptance and a lack of hardware. The practical example of BIM integration illustrates current challenges and makes it clear that the desired interoperability is not available to the extent that the market requires. These difficulties could therefore carry over to the topic of the metaverse. The results indicate that, in the context of the AECO industry, a high degree of user-friendliness of the metaverse is essential. The implementation of the metaverse requires intuitive and secure handling to arouse the interest of potential users and to recognize and effectively use the comprehensive potential of the metaverse in AECO activities. Only by ensuring user-friendliness and providing a manual on the correct use of metaverse applications can the metaverse create added value for the entire AECO industry. Additional workshops with experts from the industry can be of significant relevance for the initial step. These workshops serve to gain concrete insights into the actual potential of the metaverse and at the same time identify the fears and concerns of the target group. By defining and developing appropriate solutions, these workshops can help to create a sound framework for the introduction of the metaverse in the AECO sector. This can include, for example, a transparent and standardized manual and guidelines for handling the metaverse. The aim should be to define a common standard for the metaverse. The choice of contributors to such work is important: because the stakeholder group may also have legal implications beyond the AECO sector, government representatives, for example, may need to be involved.

5.6 Enthusiasts' caution

Even among those enthusiastic about the metaverse, there is visible caution. This can be attributed to the respondents being well-informed about the industry’s challenges and perhaps having witnessed the gap between initial technological hype and practical implementation. It can be assumed that only those who are interested in the topic participated in our survey, reflecting a rather conservative attitude of the overall industry. The ongoing digitization in the AECO industry, characterized by the introduction of Building Information Modeling, artificial intelligence, the Internet of Things, and robotics, can also explain the cautious attitude of the respondents. It is possible that the metaverse, with its characteristic features, appears premature for the target group, especially as the need for BIM and other digitization processes in the industry has yet to be definitively defined and established. This dynamic can indicate the need to first focus on consolidating and integrating existing digital technologies before the target group can consider seamlessly integrating the metaverse. The metaverse can therefore be seen as a revolutionary innovation within the AECO industry, although it is not yet within reach for the target group due to other ongoing digitization processes. The study by PwC [ 1 ] has shown that there is a discrepancy in the AECO sector between progress in terms of sustainability and digital transformation: while sustainability efforts are progressing, digitization is lagging behind. This discrepancy could explain why skepticism and resistance to digital transformation are evident in the survey conducted. In fact, digital initiatives in the AECO sector are met with resistance, as the example of BIM illustrates. Problems such as a lack of interoperability and high investment costs continue to hamper successful implementation in the sector. In the workshops held, it was recognized that the participants see potential for new business models in the metaverse, but they also emphasized the need for training to promote a common understanding. Interoperability, data protection, the introduction of new hardware, and the creation of uniform standards were also cited as barriers to successful implementation. By conducting further studies, holding workshops, and integrating the topic into teaching, the fears, uncertainties, and potential of industry players can be better identified. This forms a solid basis for creating a uniform understanding, ensuring transparency, and better integrating the topic of the metaverse into the AECO sector.

5.7 Cost concerns following BIM integration

The integration of Building Information Modeling (BIM) was and still is a considerable investment for the construction industry [ 38 ]. With the metaverse poised as the next big technological venture, there is an evident hesitance to commit further funds. The industry is likely still assessing the ROI from BIM and is cautious about investing in a technology that, while promising, has not yet proven its value for construction and real estate. This might be the largest driver of the above-mentioned skepticism and caution regarding a metaverse for the construction and real estate industry. Especially after the implementation of BIM in most companies of the AECO industry, many companies tend to take a wait-and-see attitude towards the metaverse and the resulting costs and interoperability requirements.

On the one hand, this may be because the introduction of other digital technologies and methods, such as BIM, has not yet been accompanied by clear regulations on payment for services. This can be illustrated using the example of the BIM method: the BIM methodology means that some services that were traditionally carried out in later life cycle phases (e.g., the construction phase) are brought forward to earlier phases (e.g., the design phase). In particular, if the companies involved change after the award and planning is not carried out across the life cycle, this results in disadvantages for the planners in the early phases. For this reason, the remuneration of the various phases should be adjusted, but this has not yet been fully implemented in the AECO industry.

On the other hand, interoperability and the exchange between different software programs is an essential basis for the successful use of digital technologies and methods in the AECO industry. This is also a decisive success criterion for the implementation of the metaverse, as the results on interoperability show. The respondents indicated, with an average of 4.26, that they consider interoperability and interfaces to other data formats to be a technological prerequisite for implementing the metaverse. This aspect also plays a central role in the current discussion about the BIM methodology, as difficulties arise in lifecycle-oriented data exchange as part of the BIM method due to missing interfaces or interfaces that have not been implemented neutrally by the software manufacturers.

In conclusion, this discussion underscores the complex, cautious approach the construction and real estate industry has towards the metaverse. It highlights a generational divide, perceptual challenges, and financial considerations. We thus suggest that the metaverse's successful integration requires a more grounded approach that addresses these concerns.

5.8 The metaverse is seen as a marketing and communication tool, not as a craft tool

The results of the survey show that the main applications are seen in the fields of marketing and visualization as well as collaboration. Especially in marketing, the arithmetic mean is consistently above 4.0 (e.g., marketing phase avg. 4.12 or virtual visits of the real estate avg. 4.12). Furthermore, the participants see high potential in the optimization of communication and in developing new ways to communicate and interact, especially by integrating visualizations. This result is not limited to the AECO industry but can also be observed in other industries. Deloitte, for example, stated in 2016 that the metaverse could support marketing by providing virtual showrooms and product presentations [ 100 ]. Similarly, [ 101 ] stated that the metaverse will support processes especially in communications, marketing, and teaching.

On the other hand, the results show that the processes in the design and construction phase—which are characterized by craftsmanship—are not really seen as use cases or applications for integration with the metaverse. The highest average is found for virtual construction meetings that might be supported by the metaverse (avg. 3.7), while supervising the construction site (avg. 3.37) or tracking defects (avg. 3.0) are not seen as applications for the metaverse.

This result is also shown in Fig. 17. The figure shows the average of the answers regarding where the metaverse could be integrated in the various phases of the buildings’ lifecycle, set against the type of work that characterizes the respective phases. While project planning and the planning phase are characterized by designing plans on computers, the construction, redevelopment, and demolition phases are characterized by craft work. It can be seen that all phases characterized by craft work are rated below 3.0 (except the redevelopment phase with 3.05), while the communication- and computer-work-based phases are all rated higher than 3.60.

Fig. 17 Comparison of the different lifecycle phases

This means that the metaverse is not seen as a tool to support production or craft on the construction site, but rather communication, visualization, and marketing. This may result from the fact that the AECO industry is seen as a traditional industry, in which changes of processes and tools take a long time. On the other hand, it shows that such applications have not yet arrived in the metaverse.

6 Conclusion and outlook

The AECO industry sees potential in the integration of the metaverse into the life cycle, in particular regarding the life cycle phases and use cases that focus on marketing, communication, and visualization. The results show a clear reluctance for life cycle phases characterized by craftsmanship, which are subject to a certain tradition in the AECO industry.

However, the results also show that young people in particular feel addressed by the metaverse. As a result, the young generation can create a dynamic in the AECO industry that triggers a digital shift. But—and this is also what the results of the survey show—the implementation of the metaverse in the AECO industry must be carefully prepared and requires change management. Based on experience with other digital technologies that have been introduced in recent years, the results show that costs, interoperability, and the involvement of the various stakeholders from the outset are decisive success criteria. Furthermore, a clear definition of the term metaverse is necessary, as proposed by [ 55 ].

This means that further research is needed to implement the metaverse in the AECO industry. First, the results should be discussed again with the early adopters who were interviewed in the workshops. Second, a definition of the metaverse in the AECO industry needs to be established; the results of the survey could serve as a basis. Third, it is necessary to integrate the metaverse more into teaching in order to educate metaverse natives who could support the digital shift of the AECO industry. Fourth, the companies of the AECO industry need to become familiar with the metaverse and its possibilities. For this, we envision applied research pilot projects for virtual, decentralized, immersive construction meetings in the metaverse with a particular focus on the connection between remote and on-site users. From our point of view, the on-site component of such systems is a core component for a real construction metaverse.

By doing so, the metaverse could support the AECO industry in its transformation into a sustainable and digital industry.
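As a purely illustrative sketch of the remote/on-site connectivity such pilot projects would need, the following relay server broadcasts position updates between meeting participants. The message format, port, and the use of the third-party `websockets` package are our assumptions for the sake of the example, not a description of an existing system.

    # Minimal sketch of a pose-relay server for a shared construction meeting
    # (assumes the third-party `websockets` package, version 10.1 or newer).
    import asyncio
    import json
    import websockets

    connected = set()

    async def relay(websocket):
        """Forward each participant's pose update to all other connected participants."""
        connected.add(websocket)
        try:
            async for message in websocket:
                update = json.loads(message)  # e.g. {"user": "site-1", "position": [1.0, 0.0, 2.5], "heading": 90}
                for peer in connected:
                    if peer is not websocket:
                        await peer.send(json.dumps(update))
        finally:
            connected.discard(websocket)

    async def main():
        async with websockets.serve(relay, "0.0.0.0", 8765):
            await asyncio.Future()  # serve until the process is stopped

    if __name__ == "__main__":
        asyncio.run(main())

A real system would add authentication, model streaming, and XR clients for the remote and on-site participants; the sketch only illustrates the relay pattern that keeps their views in sync.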

Data availability

Data is provided within the manuscript.

Code availability

Not applicable.

PWC. PwC-Studie zur Baubranche 2023: Die Digitalisierung stockt, in Sachen Nachhaltigkeit geht es voran. https://www.pwc.de/de/pressemitteilungen/2023/pwc-studie-zur-baubranche-2023-die-digitalisierung-stockt-in-sachen-nachhaltigkeit-geht-es-voran.html . Accessed 21 Dec 2023.

European Commission, Ed. Digitalisation in the construction sector: European Construction Sector Observatory. Analytical report, 2021. https://ec.europa.eu/docsroom/documents/45547/attachments/1/translations/en/renditions/native . Accessed 21 Dec 2023.

Oz B. A new perspective in construction management; the metaverse. rdlc. 2023. https://doi.org/10.7764/RDLC.22.2.321 .


Musarat MA, Sadiq A, Alaloul WS, Abdul Wahab MM. A systematic review on enhancement in quality of life through digitalization in the construction industry. Sustainability. 2023;15(1):202. https://doi.org/10.3390/su15010202 .

Wijayasekera SC, et al. Data analytics and artificial intelligence in the complex environment of megaprojects: implications for practitioners and project organizing theory. Proj Manag J. 2022;53(5):485–500. https://doi.org/10.1177/87569728221114002 .

Michell K, Brown N, Terblanche J, Tucker J. The effect of disruptive technologies on facilities management: a case study of the industrial sector, pp. 113–23.

Regona M, Yigitcanlar T, Xia B, Li RYM. Opportunities and adoption challenges of AI in the construction industry: a PRISMA review. J Open Innov Technol Mark Complex. 2022;8(1):45. https://doi.org/10.3390/joitmc8010045 .

Aghimien D, Aigbavboa C, Oke A, Thwala W, Moripe P. Digitalization of construction organisations—a case for digital partnering. Int J Constr Manag. 2022;22(10):1950–9. https://doi.org/10.1080/15623599.2020.1745134 .

Nikmehr B, Hosseini MR, Martek I, Zavadskas EK, Antucheviciene J. Digitalization as a strategic means of achieving sustainable efficiencies in construction management: a critical review. Sustainability. 2021;13(9):5040. https://doi.org/10.3390/su13095040 .

Zheng Y, Tang LCM, Chau KW. Analysis of improvement of BIM-based digitalization in engineering, procurement, and construction (EPC) projects in China. Appl Sci. 2021;11(24):11895. https://doi.org/10.3390/app112411895 .

Afzal M, Shafiq MT, Jassmi HA. Improving construction safety with virtual-design construction technologies—a review. ITcon. 2021;26:319–40. https://doi.org/10.36680/j.itcon.2021.018 .

el Mounla K, Beladjine D, Beddiar K, Mazari B. Lean-BIM approach for improving the performance of a construction project in the design phase. Buildings. 2023;13(3):654. https://doi.org/10.3390/buildings13030654 .

Bartels N, Höper J, Theissen S, Wimmer R. Application of the BIM method in sustainable construction: status quo of potential applications in practice. Cham: Springer; 2023.


Borrmann A, König M, Koch C, Beetz J. Building information modeling. Cham: Springer; 2018.


Borrmann A, König M, Koch C, Beetz J, editors. Building information modeling: Technologische Grundlagen und industrielle Praxis. 2nd ed. Wiesbaden: Springer Vieweg; 2021.

Bartels N. Strukturmodell zum Datenaustausch im facility management. 1st ed. Wiesbaden: Springer Fachmedien Wiesbaden; Imprint Springer Vieweg; 2020.

Wiese M. BIM-Prozess kompakt: Abwicklung eines Bauvorhabens mit der Planungsmethode BIM. Köln: Rudolf Müller; 2019.

Darko A, Chan AP, Yang Y, Tetteh MO. Building information modeling (BIM)-based modular integrated construction risk management—critical survey and future needs. Comput Ind. 2020;123:103327. https://doi.org/10.1016/j.compind.2020.103327 .

Al-Mohammad MS, et al. Factors affecting BIM implementation: evidence from countries with different income levels. CI. 2023;23(3):683–710. https://doi.org/10.1108/CI-11-2021-0217 .

Gao X, Pishdad-Bozorgi P. BIM-enabled facilities operation and maintenance: a review. Adv Eng Inform. 2019;39:227–47. https://doi.org/10.1016/j.aei.2019.01.005 .

Moretti N, Xie X, Merino J, Brazauskas J, Parlikad AK. An openBIM approach to IoT integration with incomplete as-built data. Appl Sci. 2020;10(22):8287. https://doi.org/10.3390/app10228287 .

Tang S, Shelden DR, Eastman CM, Pishdad-Bozorgi P, Gao X. A review of building information modeling (BIM) and the internet of things (IoT) devices integration: present status and future trends. Autom Constr. 2019;101:127–39. https://doi.org/10.1016/j.autcon.2019.01.020 .

Aladağ H, Demirdöğen G, Demirbağ AT, Işık Z. Understanding the perception differences on BIM adoption factors across the professions of AEC industry. Ain Shams Eng J. 2023;14(11):102545. https://doi.org/10.1016/j.asej.2023.102545 .

Bartels N, Hahne K. Teaching building information modeling in the metaverse—an approach based on quantitative and qualitative evaluation of the students perspective. Buildings. 2023;13(9):2198. https://doi.org/10.3390/buildings13092198 .

Wills N, Bartels N. On the applicability of open standard exchange formats for demand-oriented facility management (FM) service delivery in the context of the cross-lifecycle building information modeling (BIM) method, 2022. https://doi.org/10.34749/JFM.2022.4614 .

Shubham S, Saloni S, Sidra-Tul-Muntaha. Optimizing construction processes and improving building performance through data engineering and computation. World J Adv Res Rev. 2023;18(1):390–8. https://doi.org/10.30574/wjarr.2023.18.1.0614 .

Um J, Park J, Park S, Yilmaz G. Low-cost mobile augmented reality service for building information modeling. Autom Constr. 2023;146:104662. https://doi.org/10.1016/j.autcon.2022.104662 .

Amin K, Mills G, Wilson D. Key functions in BIM-based AR platforms. Autom Constr. 2023;150:104816. https://doi.org/10.1016/j.autcon.2023.104816 .

Safari K, AzariJafari H. Challenges and opportunities for integrating BIM and LCA: methodological choices and framework development. Sustain Cities Soc. 2021;67:102728. https://doi.org/10.1016/j.scs.2021.102728 .

Xue K, et al. BIM integrated LCA for promoting circular economy towards sustainable construction: an analytical review. Sustainability. 2021;13(3):1310. https://doi.org/10.3390/su13031310 .

Brokbals S, Čadež I. BIM in der Hochschullehre. Bautechnik. 2017;94(12):851–6. https://doi.org/10.1002/bate.201700100 .

Al-Yami A, Sanni-Anibire MO. BIM in the Saudi Arabian construction industry: state of the art, benefit and barriers. IJBPA. 2021;39(1):33–47. https://doi.org/10.1108/IJBPA-08-2018-0065 .

Maile T, Bartels N, Wimmer R. Integrated life-cycle orientated teaching of the big-open-BIM method. In: Proceedings of the 2023 European conference on computing in construction and the 40th international CIB W78 conference, 2023.

Otto J, Bartels N. Integration von FM-Prozessdaten in ein digitales Gebäudemodell. Bautechnik. 2018;95(12):823–31. https://doi.org/10.1002/bate.201800044 .

Saka AB, Chan DW. Profound barriers to building information modelling (BIM) adoption in construction small and medium-sized enterprises (SMEs). CI. 2020;20(2):261–84. https://doi.org/10.1108/CI-09-2019-0087 .

Chan DW, Olawumi TO, Ho AM. Perceived benefits of and barriers to building information modelling (BIM) implementation in construction: the case of Hong Kong. J Build Eng. 2019;25:100764. https://doi.org/10.1016/j.jobe.2019.100764 .

van Roy AF, Firdaus A. Building information modelling in Indonesia: knowledge, implementation and barriers. JCDC. 2020;25(2):199–217. https://doi.org/10.21315/jcdc2020.25.2.8 .

Bialas F, Wapelhorst V, Brokbals S, Čadež I. Quantitative Querschnittsstudie zur BIM-Anwendung in Planungsbüros. Bautechnik. 2019;96(3):229–38. https://doi.org/10.1002/bate.201800103 .

Ahmed S. Barriers to implementation of building information modeling (BIM) to the construction industry: a review. J Civ Eng Constr. 2018;7(2):107. https://doi.org/10.32732/jcec.2018.7.2.107 .

Krämer M, Besenyői Z. Towards digitalization of building operations with BIM. IOP Conf Ser Mater Sci Eng. 2018;365:22067. https://doi.org/10.1088/1757-899X/365/2/022067 .

Stojanovska-Georgievska L, et al. BIM in the center of digital transformation of the construction sector—the status of BIM adoption in North Macedonia. Buildings. 2022;12(2):218. https://doi.org/10.3390/buildings12020218 .

Sompolgrunk A, Banihashemi S, Mohandes SR. Building information modelling (BIM) and the return on investment: a systematic analysis. CI. 2023;23(1):129–54. https://doi.org/10.1108/CI-06-2021-0119 .

Bartels N, Wimmer R. BIM von A bis Z etablieren, inklusive „Tuning“-Maßnahmen: Teil 1: AIA und BAP. tab, no. 12, 2023. pp. 32–6.

Akcay EC. Analysis of challenges to BIM adoption in mega construction projects. IOP Conf Ser Mater Sci Eng. 2022;1218(1):12020. https://doi.org/10.1088/1757-899X/1218/1/012020 .


Elhendawi A, Omar H, Elbeltagi E, Smith A. Practical approach for paving the way to motivate BIM non-users to adopt BIM. Int J BIM Eng Sci. 2020;2(2):1–22.

Stephenson N. Snow crash. New York: Bantam Books; 1992.

Qi P, Chen Z. The origin, characteristics and prospect of metaverse. AEHSSR. 2022;1(1):315. https://doi.org/10.56028/aehssr.1.1.315 .

Monaco S, Sacchi G. Travelling the metaverse: potential benefits and main challenges for tourism sectors and research applications. Sustainability. 2023;15(4):3348. https://doi.org/10.3390/su15043348 .

Chow Y-W, Susilo W, Li Y, Li N, Nguyen C. Visualization and cybersecurity in the metaverse: a survey. J Imaging. 2022. https://doi.org/10.3390/jimaging9010011 .

Weinberger M. What is metaverse?—A definition based on qualitative meta-synthesis. Future Internet. 2022;14(11):310. https://doi.org/10.3390/fi14110310 .

Gadekallu TR, et al. Blockchain for the metaverse: a review, 2022. http://arxiv.org/pdf/2203.09738v2 .

Mystakidis S. Metaverse. Encyclopedia. 2022;2(1):486–97. https://doi.org/10.3390/encyclopedia2010031 .

Möller P, Wenz D. Metaverse: Statistiken und Zahlen (2022). Bitcoin2Go (bitcoin-2go.de). Accessed 20 Nov 2023.

Fernandez CB, Hui P. Life, the metaverse and everything: an overview of privacy, ethics, and governance in metaverse, pp. 272–7.

Buchholz F, Oppermann L, Prinz W. There’s more than one metaverse. i-com. 2022;21(3):313–24. https://doi.org/10.1515/icom-2022-0034 .

Huynh-The T, et al. Blockchain for the metaverse: a review. Future Gener Comput Syst. 2023;143:401–19. https://doi.org/10.1016/j.future.2023.02.008 .

Radoff J. The metaverse value-chain. https://medium.com/building-the-metaverse/the-metaverse-value-chain-afcf9e09e3a7 . Accessed 20 Nov 2023.

Benford S. Metaverse: five things to know—and what it could mean for you. https://theconversation.com/metaverse-five-things-to-know-and-what-it-could-mean-for-you-171061 . Accessed 20 Nov 2023.

Chen Z. Metaverse office: exploring future teleworking model. K. 2023. https://doi.org/10.1108/K-10-2022-1432 .

Milgram P, Takemura H, Utsumi A, Kishino F. Augmented reality: a class of displays on the reality-virtuality continuum. In: Telemanipulator and Telepresence Technologies, Boston, MA, 1995, pp. 282–92.

Alsop T. Extended reality (XR) market size worldwide from 2021 to 2026. https://www.statista.com/statistics/591181/global-augmented-virtual-reality-market-size/ . Accessed 20 Nov 2023.

Maereg AT, Nagar A, Reid D, Secco EL. Wearable vibrotactile haptic device for stiffness discrimination during virtual interactions. Front Robot AI. 2017. https://doi.org/10.3389/frobt.2017.00042 .

Atsikpasi P, Fokides E. A scoping review of the educational uses of 6DoF HMDs. Virtual Real. 2022;26(1):205–22. https://doi.org/10.1007/s10055-021-00556-9 .

Slater M, Sanchez-Vives MV. Enhancing our lives with immersive virtual reality. Front Robot AI. 2016. https://doi.org/10.3389/frobt.2016.00074 .

Pellas N, Dengel A, Christopoulos A. A scoping review of immersive virtual reality in STEM education. IEEE Trans Learn Technol. 2020;13(4):748–61. https://doi.org/10.1109/TLT.2020.3019405 .

Pellas N, Mystakidis S, Kazanidis I. Immersive virtual reality in K-12 and higher education: a systematic review of the last decade scientific literature. Virtual Real. 2021;25(3):835–61. https://doi.org/10.1007/s10055-020-00489-9 .

Ibáñez M-B, Delgado-Kloos C. Augmented reality for STEM learning: a systematic review. Comput Educ. 2018;123:109–23. https://doi.org/10.1016/j.compedu.2018.05.002 .

Klopfer E. Augmented learning: research and design of mobile educational games. Cambridge: MIT Press; 2008.

Mystakidis S, Christopoulos A, Pellas N. A systematic mapping review of augmented reality applications to support STEM learning in higher education. Educ Inf Technol. 2022;27(2):1883–927. https://doi.org/10.1007/s10639-021-10682-1 .

McKinsey & Company. Value creation in the metaverse. https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/value-creation-in-the-metaverse . Accessed 20 Nov 2023.

Asara C. Real estate in the metaverse, 2022. https://www.politesi.polimi.it/retrieve/e1edc3e1-c9c2-4286-a70e-097a32a90f2e/2022_12_Asara.pdf . Accessed 20 Nov 2023.

Alkhaldi N. Real estate in the metaverse: market trends, opportunities, and tips for technology enthusiasts. https://itrexgroup.com/blog/real-estate-in-the-metaverse-market-trends-opportunities-tips/#heade . Accessed 20 Nov 2023.

Deloitte. Real estate in the metaverse- risk or opportunity? https://www2.deloitte.com/content/dam/Deloitte/nl/Documents/real-estate/deloitte-nl-real-estate-re-in-the-metaverse.pdf . Accessed 20 Nov 2023.

Nieradka P. Using virtual reality technologies in the real estate sector. h. 2019;53(2):45. https://doi.org/10.17951/h.2019.53.2.45-53 .

Brenner AJ. Virtual reality: the game changer for residential real estate staging through increased presence. Claremont McKenna College, 2016. https://scholarship.claremont.edu/cgi/viewcontent.cgi?article=2568&context=cmc_theses . Accessed 15 Nov 2023.

Sihi D. Home sweet virtual home. JRIM. 2018;12(4):398–417. https://doi.org/10.1108/JRIM-01-2018-0019 .

Lee JY. A study on metaverse hype for sustainable growth, 2021. https://koreascience.kr/article/JAKO202128054633800.pdf .

Huynh-The T, Pham Q-V, Pham X-Q, Nguyen TT, Han Z, Kim D-S. Artificial intelligence for the metaverse: a survey. Eng Appl Artif Intell. 2023;117:105581. https://doi.org/10.1016/j.engappai.2022.105581 .

Ramu SP, et al. Federated learning enabled digital twins for smart cities: concepts, recent advances, and future directions. Sustain Cities Soc. 2022;79:103663. https://doi.org/10.1016/j.scs.2021.103663 .

Hyve. XR, AR, VR, MR – hinter Abkürzungen verbergen sich Welten . https://www.hyve.net/de/blog/all-about-virtual-reality/ . Accessed 23 Nov 2023.

Waqar A, et al. Analyzing the success of adopting metaverse in construction industry: structural equation modelling. J Eng. 2023;2023:1–21. https://doi.org/10.1155/2023/8824795 .

Hernandez R, Rivet E, Priest D, Panjwani V, Korizis G, Likens S. Demystifying the metaverse: what business leaders need to know and do. https://www.pwc.com/us/en/tech-effect/emerging-tech/demystifying-the-metaverse.html . Accessed 10 Nov 2023.

Waqar A, et al. Effect of Coir Fibre Ash (CFA) on the strengths, modulus of elasticity and embodied carbon of concrete using response surface methodology (RSM) and optimization. Results Eng. 2023;17:100883. https://doi.org/10.1016/j.rineng.2023.100883 .

Waqar A, et al. Effect of volcanic pumice powder ash on the properties of cement concrete using response surface methodology. J Build Rehabil. 2023. https://doi.org/10.1007/s41024-023-00265-7 .

Waqar A, Othman I, Skrzypkowski K, Ghumman ASM. Evaluation of success of superhydrophobic coatings in the oil and gas construction industry using structural equation modeling. Coatings. 2023;13(3):526. https://doi.org/10.3390/coatings13030526 .

Wang X, Wang J, Wu C, Xu S, Ma W. Engineering brain: metaverse for future engineering. AI Civ Eng. 2022. https://doi.org/10.1007/s43503-022-00001-z .

Chen Y, Wang X, Liu Z, Cui J, Osmani M, Demian P. Exploring building information modeling (BIM) and Internet of Things (IoT) integration for sustainable building. Buildings. 2023;13(2):288. https://doi.org/10.3390/buildings13020288 .

Waqar A, Othman I, Almujibah H, Khan MB, Alotaibi S, Elhassan AAM. Factors influencing adoption of digital twin advanced technologies for smart city development: evidence from Malaysia. Buildings. 2023;13(3):775. https://doi.org/10.3390/buildings13030775 .

Zhao X, Lu Q. Governance of the metaverse: a vision for agile governance in the future data intelligence world. J Libr Sci China. 2022;67:113–28.

IEEE P802.22/D5, February 2019: IEEE draft standard for spectrum characterization and occupancy sensing. [Place of publication not identified]: IEEE, 2019.

Waqar A, Othman I, Pomares JC. Impact of 3D printing on the overall project success of residential construction projects using structural equation modelling. Int J Environ Res Public Health. 2023. https://doi.org/10.3390/ijerph20053800 .

Baghalzadeh Shishehgarkhaneh M, Keivani A, Moehler RC, Jelodari N, Roshdi Laleh S. Internet of Things (IoT), building information modeling (BIM), and digital twin (DT) in construction industry: a review, bibliometric, and network analysis. Buildings. 2022;12(10):1503. https://doi.org/10.3390/buildings12101503 .

Wang G, Shin C. Influencing factors of usage intention of metaverse education application platform: empirical evidence based on PPM and TAM models. Sustainability. 2022;14(24):17037. https://doi.org/10.3390/su142417037 .

Chen Y, Huang D, Liu Z, Osmani M, Demian P. Construction 4.0, Industry 4.0, and building information modeling (BIM) for sustainable building development within the smart city. Sustainability. 2022;14(16):10028. https://doi.org/10.3390/su141610028 .

Bian L, Xiao R, Lu Y, Luo Z. Construction and design of food traceability based on blockchain technology applying in the metaverse, pp. 294–305.

Saker M, Frith J. Contiguous identities. FM. 2022. https://doi.org/10.5210/fm.v27i3.12471 .

Hellweg M. Bedeutung der Immobilienbranche. https://zia-deutschland.de/project/bedeutung-der-immobilienbranche/ . Accessed 20 Dec 2023.

Lanier J. Who owns the future? London: Penguin Books; 2014.

Dwivedi YK, et al. The difficult journey: the challenges of manifesting the impact of academic research in practice, policy and society. Int J Inf Manag. 2024 (to appear).

Esser R, Oppermann L, Lutter T. Head mounted displays in deutschen Unternehmen: Ein virtual, augmented und mixed reality check. Düsseldorf/Berlin, 2016. https://www2.deloitte.com/de/de/pages/technology-media-and-telecommunications/articles/head-mounted-displays-in-deutschen-unternehmen.html . Accessed 19 Dec 2023.

Kind S, Ferdinand J-P, Jetzke T, Richter S, Weide S. Virtual und augmented reality: status quo, Herausforderungen und zukünftige Entwicklungen. Karlsruhe, TAB-Arbeitsbericht 180, 2019.


Author information

Authors and Affiliations

TH Köln - University of Applied Sciences, Betzdorfer Str., 50679, Cologne, Germany

Hannah Claßen & Niels Bartels

Fraunhofer FIT, Schloss Birlinghoven 1, 53757, Sankt Augustin, Germany

Urs Riedlinger & Leif Oppermann


Contributions

Conceptualization: NB, UR, HC; methodology: NB; formal analysis and investigation: HC, NB, UR; writing—original draft preparation: HC, UR, NB, LO; writing—review and editing: HC, UR, NB, LO; supervision: NB, LO.

Corresponding authors

Correspondence to Niels Bartels or Urs Riedlinger .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: questionnaire

FT = Free Text, LS = Likert Scale, LSM = Likert Scale Matrix, MC = Multiple Choice, SC = Single Choice.

To which age group do you belong? [SC]

16–29 years

30–40 years

41–50 years

51–60 years

61–70 years

71–80 years

Are you a student or a professional? [SC]

Student

Professional

(only if Q2 student) In which degree programme are you enrolled? [FT]

(only if Q2 professional) What function do you perform in the real estate industry? [SC]

Civil engineer

Planning of technical building equipment

Structural engineering

Project development

Project management

Facility Management

Property Management

Investment management

Other: [FT]

How confident do you feel in using the term metaverse? [LS: 5 = very safe, 1 = very unsafe]

What characteristics define the term metaverse for you? [LSM: 5 = fully agree, 1 = fully disagree]

A metaverse is a social medium where people can interact, communicate, collaborate and own property.

A metaverse is a combination of virtual worlds and augmented real worlds.

A metaverse is persistent and long-lasting.

A metaverse is an integrated system that incorporates and uses XR and other technologies.

Capturing the state of the user and the real environment is a key action for metaverse applications.

Metaverse participation is multimodal and can take place with different intensities and representations, such as embodiment through avatars.

A metaverse is closely coupled with reality.

How much do you think the metaverse will change the work in the following life cycle phases? [LSM: 5 = fully agree, 1 = fully disagree]

Project planning/initiation phase

Planning phase

Construction phase

Marketing phase

Redevelopment phase

Demolition phase

Do you see the Metaverse as a new communication basis in the construction and real estate industry? [SC]

Other [Free Text]

Which areas of application in the real estate industry, especially for the planning phase, could be relevant? [LSM: 5 = fully agree, 1 = fully disagree]

Digital purchase and sale of real estate in the metaverse

Collaborative cooperation of all stakeholders concerned on a building model

Simulation of the user profile of the planned building model, of hazardous situations during the use phase in the form of training (e.g. fire protection), sustainability balances

Virtual property inspections using XR technologies

Digital real estate planning as a digital twin enables real-time data that supports decision-making

Support with the digital building application and coordination with approval authorities

Integrating cubatures into the landscape

Which areas of application in the real estate industry, especially in the construction phase, could be relevant? [LSM: 5 = fully agree, 1 = fully disagree]

Virtual/collaborative construction project planning/site meetings

Use tools in the metaverse to create and check building structures and designs in real time.

Construction monitoring and actual state recordings by means of drones for live transmission to the Metaverse

Simulation of construction processes, hazard analysis, interface analysis (e.g. with other subcontractors)

Safety training for employees on the construction site

Coordination of resource requirements by means of simulation and digital construction project planning

Coordination of logistics by means of simulation and digital construction project planning

Tracking of construction defects in the metaverse

Which areas of application in the real estate industry, especially in the operating phase, could be relevant? [LSM: 5 = fully agree, 1 = fully disagree]

Virtual facility management: management of buildings in the metaverse, including maintenance, servicing, repairs, space optimisation, energy management

Building automation control

Explanation of safety measures, building introduction for users/occupants

The metaverse serves as an interaction community for the users/residents

Use as indoor navigation

Simulation of media flows

Which areas of application in the real estate industry, especially in the marketing phase, could be relevant? [LSM: 5 = fully agree, 1 = fully disagree]

Virtual property viewings to market the property

Consultation and sales talks in the metaverse

Due Diligence in the Metaverse

Which areas of application in the real estate industry, especially in the demolition phase, could be relevant? [LSM: 5 = fully agree, 1 = fully disagree]

Digital twin in the Metaverse supports the recycling and recovery process of used substances and materials

Coordination of the demolition concept

What challenges do you see in relation to the metaverse? [MC]

Data protection and privacy

Dependence on the digital world

Lack of user acceptance

Social isolation

Lack of social interaction

Fears of cyber attacks

Fraud, e.g. through data theft

Monopoly position of a provider

Insufficient hardware equipment (e.g. VR/AR glasses)

Slow internet connection

Insufficient software equipment

Too much fusion of digital and real properties (keyword: who is the owner?)

Health problems (e.g. fatigue from using XR, motion sickness).

High implementation costs

Digital currencies (Bitcoin, etc.)

What can be the strengths of using Metaverse in the real estate industry? [LSM: 5 = fully agree, 1 = fully disagree]

Increasing the efficiency of process flows within the life cycle phases

Expanded communication basis of all actors concerned, using the example of collaborative cooperation

3D visualisations in the metaverse simplify decision making

Open communication structures

Communication independent of time and place

Flexibilisation in the course of the life cycle

Improving understanding through enhanced visualization

Increased sense of reality

Which technological prerequisites are essential for implementing the metaverse in the life cycle? [LSM: 5 = fully agree, 1 = fully disagree]

A reliable IT infrastructure

Adequate security solutions

Scalability of the technology for growing demand

Interoperability and interfaces to other data formats

Interoperability and interfaces to other hardware products

Interoperability and interfaces to other software products

What personal knowledge is essential for implementing the metaverse in the life cycle? [LSM: 5 = fully agree, 1 = fully disagree]

Technical know-how (programming languages, technology, etc.)

Understanding security risks

Adaptability for changes in technology

Analytical skills to analyse data and gain insights from it

Do you have any other suggestions or comments on the topic of metaverse in the real estate industry that you would like to share? [FT]

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Claßen, H., Bartels, N., Riedlinger, U. et al. Transformation of the AECO industry through the metaverse: potentials and challenges. Discov Appl Sci 6 , 461 (2024). https://doi.org/10.1007/s42452-024-06162-z


Received : 08 April 2024

Accepted : 21 August 2024

Published : 27 August 2024

DOI : https://doi.org/10.1007/s42452-024-06162-z

Share this article

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative

Advertisement

  • Find a journal
  • Publish with us
  • Track your research
