Talk:Data quality

All 60 dimensions of data quality in one list

My suggestion is to add the following phrase to the chapter about data quality dimensions:

In a later study, sixty dimensions of data quality were inventoried by comparing definitions from different sources. In this study, the definitions were formulated so that they comply with the ISO 704 standard. The result is a list of 60 standardised definitions of dimensions of data quality. Each dimension is also annotated with the data concept it belongs to. This way, a distinction can be made, for instance, between the completeness of values, the completeness of records and the completeness of attributes. https://www.dama-nl.org/wp-content/uploads/2020/09/DDQ-Dimensions-of-Data-Quality-Research-Paper-version-1.2-d.d.-3-Sept-2020.pdf

Do you agree?

Pndt (talk) 18:11, 1 December 2022 (UTC)
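
To make the distinction in the proposed phrase concrete, here is a minimal sketch of how the three kinds of completeness could be measured separately. The function names, the ratio-based scoring and the sample records are illustrative assumptions of mine; they are not taken from the DAMA-NL paper or from ISO 704.

```python
from typing import Any, Optional

# A record is modelled as a dict; None stands for a missing value (an assumption).
Record = dict[str, Optional[Any]]


def value_completeness(records: list[Record], attribute: str) -> float:
    """Share of records that hold a non-missing value for one attribute."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(attribute) is not None)
    return filled / len(records)


def record_completeness(records: list[Record], expected_count: int) -> float:
    """Share of the expected records that are actually present in the data set."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return min(len(records), expected_count) / expected_count


def attribute_completeness(records: list[Record], required: list[str]) -> float:
    """Share of the required attributes that exist in the data set at all."""
    if not required:
        return 1.0
    present = set().union(*(r.keys() for r in records))
    return len(present & set(required)) / len(required)


# Hypothetical sample data: three customers, one with a missing e-mail value.
customers: list[Record] = [
    {"name": "Ann", "email": "ann@example.org"},
    {"name": "Bob", "email": None},
    {"name": "Cy", "email": "cy@example.org"},
]

print(value_completeness(customers, "email"))
print(record_completeness(customers, expected_count=4))
print(attribute_completeness(customers, ["name", "email", "phone"]))
```

The same data set scores differently on each measure, which is the point of tying each dimension to a data concept.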

The first sentence

The first sentence isn't exactly exploding with useful information, is it?


True. But it does distinguish between "quality of data" and "data about quality" (of some other process). For example, people making car parts are interested in the quality of those parts, and gather data (measurements) about them. This article is not about that.



I have to agree with the first comment; the first sentence was a mental turn-off for me. As a reader, it made me think 'well, duh'. I don't think Data Quality would be confused with Quality Data, as you suggest, either. The following sentences are good summaries. Perhaps you could rewrite the first sentence to improve the immediate readability.

The reference to NCOA is rather specific to the US. Could you add 'In the USA...' or some other clarifying text? --81.146.33.135 17:38, 11 June 2007 (UTC)


There are plenty of other definitions of Data Quality around. E.g., "Data quality is the measure of the agreement between the data views presented by an information system and that same data in the real world.", which can be found in "ORR, Ken. Data quality and systems theory. Communications of the ACM, 1998, 41.2: 66-71.". Should we disambiguate some definitions in the beginning?

Pietercolpaert (talk) 10:16, 24 March 2014 (UTC)


The introduction could be revamped to add clarity to the page topic. Would something along these lines be appropriate?: 'Data quality is a multi-dimensional subject that focuses on the fitness of a given set of data for use. This complexity has resulted in multiple definitions of data quality including: ...'

Then the different sources can be quoted with their specific definitions (and more definitions than are currently included can probably be added and categorized). 205.214.190.129 (talk) 16:24, 2 August 2016 (UTC)

External links

I've moved the following links here from the main page:

  • www.b-eye-network.com?offer=Wikipedia Business Intelligence Network Comprehensive online resource for data warehouse, data quality and business performance management professionals.
  • searchcrm.techtarget.com?offer=Wikipedia SearchCRM.com Original daily news, webcasts, expert advice, white papers and more resources on data quality.
  • www.dmreview.com/portals/portal.cfm?topicId=230005 DMReview.com Data Quality Portal for practitioners, containing articles, whitepapers and webinars related to data quality.
  • www.infoimpact.com/ Larry English's Homepage Conferences, books, whitepapers, case studies and consulting - Larry English does it all.
  • ghill.customer.netspace.net.au/iq_attr.html Interactive IQ Explorer For the exploration of data quality concepts, dimensions and attributes.

If you'd like to add some of them back per Wikipedia:External links please discuss it here.
brenneman(t)(c) 00:20, 22 July 2005 (UTC)



The last two links have some merit:

  • While kind of lame, the IQ explorer does list over 170 IQ attributes from the academic and practitioner literature. This is useful to readers wanting to get a feel for how poorly defined, overlapping and wide open these concepts are. It might even prompt someone to drag this list into Wikipedia, as per the Ilities.
  • The link to Larry English is worth including, since he was mentioned in the body of the article and he has a large amount of practical material (talks, tutorials, seminars, articles and books) on his site.



Referring back to the first comment

Referring back to the first comment above, perhaps this observation might be useful.

If the definition of data quality is the perception of fitness for purpose by a data consumer, then consider this: if I presented a set of data that was accurate, complete, relevant and timely (fulfilling the second given criteria for data quality), but the data consumer did not trust or understand that data, or could not manipulate it in the way they wanted (or had some other subjective objection), that data would not be fit for purpose, and therefore of low quality. Does this not emphasize that the presentation of data is vital to get right, and that user attitudes like trust are just as relevant as getting accuracy correct? If we roll out information systems without due regard to the "soft system" subjective issues, we risk failure.

I think the definitions and the contrast between them are very useful.

Reference section

This is currently just "external links" in another guise. If someone wants to convert this into actual references per Wikipedia:Footnotes that's great, otherwise I'll take them out. - brenneman{L} 07:19, 21 April 2006 (UTC)

Broken Link

The GIS Glossary Link is broken. It leads to a missing page. The current URL is: http://www.fw.umn.edu/FW5620/glossary.htm

(A little ironic on a Data Quality Article)


Additional Broken Links as at 10/25/2017

Just had a quick and non-exhaustive look at some links:

4 "IAIDQ--glossary"

5 "Government of British Columbia"

10 "Address Management for Mail-Order and Retail"

11 http://ribbs.usps.gov/move_update/documents/tech_guides/PUB363.pdf


Also the link for 3 '"Data Quality: High-impact Strategies - What You Need to Know: Definitions, Adoptions, Impact, Benefits, Maturity, Vendors". Retrieved 5 February 2013.' is set to private so not sure that can be used as a source?

Andypolack (talk) 13:54, 25 October 2017 (UTC)

Reference that should be removed

The no. 1 reference, "Data Quality: High-impact Strategies - What You Need to Know: Definitions, Adoptions, Impact, Benefits, Maturity, Vendors", should be removed. It is self-promotion for a copy-paste book of free information (mainly Wikipedia, as it turns out, according to Amazon reviews). 194.138.12.167 (talk) 13:04, 8 October 2013 (UTC)

I second this. See also this blog article: http://infosecisland.com/blogview/17693-Theres-a-Sucker-Born-Every-Minute--and-Charlatans-to-Make-Sure-They-Pay-for-It.html -- Preceding unsigned comment added by 128.176.159.37 (talk) 13:52, 11 September 2014 (UTC)

Self Promoting Rubbish

I have seldom read a Wikipedia article that is more self-promoting and less informational than this tripe allegedly on Data Quality. Come on guys - you are hardly Codd on databases or Kimball on data warehousing. Knock it off - it does nothing but discredit you. Let's have some proper definitions of the problem and its component parts, a mention of the main associations, methodologies and tools, and leave it at that. -- Preceding unsigned comment added by 84.92.230.173 (talk) 17:47, 17 October 2013 (UTC)

Needs to be completely rewritten?

In an industry where there have been significant changes to this concept, this article has not kept up (and was not high quality to begin with).

A better structure for this article would be:

  • Definition(s)
  • DQ Elements (consolidated list from multiple current sources) - A list of attributes that are used to define the quality level
  • DQ Measurement - Discussion about different ways quality is measured
  • DQ Functions - The functions that need to occur to produce high quality data - these should just be links to other relevant articles rather than expanded discussion in this article, as these functions themselves are complex topics
  • References
  • Further Reading

205.214.190.129 (talk) 16:40, 2 August 2016 (UTC)

Merge from Information quality

While some do distinguish the two concepts theoretically, in practice (and in our Wikipedia articles) the two are thoroughly intermixed. A brief section describing the distinction will do. Staszek Lem (talk) 20:04, 22 August 2016 (UTC)

There's pretty good reason for keeping them separate, at least in their current states. The Data quality article really does deal with data: raw facts and figures. The Information quality article deals (mostly) with information: interpreted facts, evidence, lines of argument, scoped arguments. Information quality mentions, for example, criteria for quality such as "believable, concise, complete, consistently represented" -- all of which make sense for information but not for data.
I vote to keep them separate, or you get a Frankenstein article that means nothing to anyone.
Steve Rapaport (talk) 23:17, 5 May 2017 (UTC)
Closing, given the lack of support over 18 months and the uncontested opposition. Klbrain (talk) 17:39, 2 April 2018 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Data quality. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. -- InternetArchiveBot (Report bug) 06:26, 5 September 2017 (UTC)