Monday, September 2, 2013

Data management

From Wikipedia, the free encyclopedia

Overview

The official definition provided by DAMA International, the professional organization for those in the data management profession, is: "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise." (DAMA International) This definition is fairly broad and encompasses a number of professions which may not have direct technical contact with lower-level aspects of data management, such as relational database management.
Alternatively, the definition provided in the DAMA Data Management Body of Knowledge (DAMA-DMBOK) is: "Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets."[1]
The concept of "Data Management" arose in the 1980s as technology moved from sequential processing (first cards, then tape) to random access processing. Since it was now technically possible to store a single fact in a single place and access it using random access disk, those suggesting that "Data Management" was more important than "Process Management" used arguments such as "a customer's home address is stored in 75 (or some other large number) places in our computer systems." During this period, random access processing was not competitively fast, so those suggesting "Process Management" was more important than "Data Management" used batch processing time as their primary argument. As applications moved more and more into real-time, interactive applications, it became obvious to most practitioners that both management processes were important. If the data was not well defined, the data would be misused in applications. If the process wasn't well defined, it was impossible to meet user needs.

Corporate Data Quality Management

Corporate Data Quality Management (CDQM) is, according to the European Foundation for Quality Management and the Competence Center Corporate Data Quality (CC CDQ, University of St. Gallen), the whole set of activities intended to improve corporate data quality (both reactive and preventive). The main premise of CDQM is the business relevance of high-quality corporate data. CDQM comprises the following activity areas:[2]
  • Strategy for Corporate Data Quality: Because CDQM is affected by various business drivers and requires the involvement of multiple divisions in an organization, it must be considered a company-wide endeavor.
  • Corporate Data Quality Controlling: Effective CDQM requires compliance with standards, policies, and procedures. Compliance is monitored according to previously defined metrics and performance indicators and reported to stakeholders; a minimal example of such a metric is sketched after this list.
  • Corporate Data Quality Organization: CDQM requires clear roles and responsibilities for the use of corporate data. The CDQM organization defines tasks and privileges for decision making for CDQM.
  • Corporate Data Quality Processes and Methods: In order to handle corporate data properly and in a standardized way across the entire organization, and to ensure corporate data quality, standard procedures and guidelines must be embedded in the company's daily processes.
  • Data Architecture for Corporate Data Quality: The data architecture consists of the data object model (the unambiguous definition and the conceptual model of corporate data) and the data storage and distribution architecture.
  • Applications for Corporate Data Quality: Software applications support the activities of Corporate Data Quality Management. Their use must be planned, monitored, managed and continuously improved.
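As a rough illustration of the kind of metric used in Corporate Data Quality Controlling, the sketch below computes a simple completeness score over a set of records. The record layout, field names and values are illustrative assumptions, not part of any DAMA or CC CDQ standard.

  # A simple data-quality metric: the share of records whose required
  # fields are present and non-empty. Records and fields are made up.
  def completeness(records, required_fields):
      """Share of records in which every required field is present and non-empty."""
      if not records:
          return 1.0
      complete = sum(
          1 for r in records
          if all(r.get(f) not in (None, "") for f in required_fields)
      )
      return complete / len(records)

  customers = [
      {"id": 1, "name": "Acme GmbH", "country": "DE"},
      {"id": 2, "name": "", "country": "CH"},        # missing name
      {"id": 3, "name": "Globex", "country": None},  # missing country
  ]

  score = completeness(customers, required_fields=["name", "country"])
  print(f"completeness = {score:.2f}")  # 0.33 -> reported to stakeholders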

Data integrity

Data integrity

From Wikipedia, the free encyclopedia
In computing, data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle,[1] and is an important feature of a database or RDBMS. Data integrity means that the data contained in the database is accurate and reliable. Data warehousing and business intelligence in general demand the accuracy, validity and correctness of data despite hardware failures, software bugs or human error. Data that has integrity is identically maintained during any operation, such as transfer, storage or retrieval.
All characteristics of data, including business rules, rules for how pieces of data relate, dates, definitions and lineage must be correct for its data integrity to be complete. When functions operate on the data, the functions must ensure integrity. Examples include transforming the data, storing history and storing metadata.
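As a minimal sketch of such a function, the example below applies a transformation without mutating the original record, keeping the previous value as history and recording metadata about the change. The field names and the "_history" convention are assumptions made for illustration.

  import copy
  from datetime import datetime, timezone

  def transform(record, field, func, reason):
      """Return a transformed copy of record, with history and metadata attached."""
      new = copy.deepcopy(record)              # never mutate the source record
      new.setdefault("_history", []).append({
          "field": field,
          "old": record.get(field),
          "at": datetime.now(timezone.utc).isoformat(),
          "reason": reason,                    # metadata: why the value changed
      })
      new[field] = func(record.get(field))
      return new

  row = {"customer": "acme gmbh"}
  row2 = transform(row, "customer", str.upper, reason="normalize casing")
  print(row2["customer"])                 # 'ACME GMBH'
  print(row2["_history"][0]["old"])       # 'acme gmbh' is preserved as history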

Databases

Data integrity includes guidelines for data retention, specifying or guaranteeing how long data can be retained in a particular database, and what can be done with data values when their validity or usefulness expires. To achieve data integrity, these rules are consistently and routinely applied to all data entering the system; any relaxation of enforcement can cause errors in the data. Implementing checks on the data as close as possible to the source of input (such as human data entry) lets less erroneous data enter the system. Strict enforcement of data integrity rules lowers error rates and saves the time spent troubleshooting and tracing erroneous data and the errors it causes in algorithms.
Data integrity also includes rules defining the relations a piece of data can have to other pieces of data, such as a Customer record being allowed to link to purchased Products, but not to unrelated data such as Corporate Assets. Data integrity often includes checks and correction for invalid data, based on a fixed schema or a predefined set of rules; an example is textual data entered where a date-time value is required. Rules for data derivation are also applicable, specifying how a data value is derived based on an algorithm, contributors and conditions, and the conditions under which the data value may be re-derived.
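The date-time case above can be illustrated with a small validation sketch: free-text input is checked against the expected format before it enters the system. The accepted format string is an assumption chosen for the example.

  from datetime import datetime

  def parse_order_date(text):
      """Accept only 'YYYY-MM-DD HH:MM' values; reject everything else at entry."""
      try:
          return datetime.strptime(text.strip(), "%Y-%m-%d %H:%M")
      except ValueError as exc:
          raise ValueError(f"not a valid date-time: {text!r}") from exc

  print(parse_order_date("2013-09-02 14:30"))   # accepted
  try:
      parse_order_date("next Tuesday")          # rejected before it reaches storage
  except ValueError as error:
      print(error)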

Types of integrity constraints

Data integrity is normally enforced in a database system by a series of integrity constraints or rules. Three types of integrity constraints are an inherent part of the relational data model: entity integrity, referential integrity and domain integrity:
  • Entity integrity concerns the concept of a primary key. Entity integrity is an integrity rule which states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null.
  • Referential integrity concerns the concept of a foreign key. The referential integrity rule states that any foreign-key value can only be in one of two states. The usual state of affairs is that the foreign key value refers to a primary key value of some table in the database. Occasionally, and this will depend on the rules of the data owner, a foreign-key value can be null. In this case we are explicitly saying that either there is no relationship between the objects represented in the database or that this relationship is unknown.
  • Domain integrity specifies that all columns in a relational database must be declared upon a defined domain. The primary unit of data in the relational data model is the data item. Such data items are said to be non-decomposable or atomic. A domain is a set of values of the same type. Domains are therefore pools of values from which the actual values appearing in the columns of a table are drawn.
If a database supports these features, it is the responsibility of the database to ensure data integrity as well as the consistency model for the data storage and retrieval. If a database does not support these features, it is the responsibility of the applications to ensure data integrity while the database supports the consistency model for the data storage and retrieval.
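The three constraint types can be seen in a small sketch using Python's built-in sqlite3 module; the tables, columns, and the two-letter country-code rule are illustrative choices, not requirements of the relational model.

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked

  con.execute("""
      CREATE TABLE customer (
          customer_id INTEGER PRIMARY KEY,            -- entity integrity: unique, not null
          name        TEXT NOT NULL,
          country     TEXT NOT NULL
                      CHECK (length(country) = 2)     -- domain integrity: two-letter codes only
      )""")

  con.execute("""
      CREATE TABLE purchase (
          purchase_id INTEGER PRIMARY KEY,
          customer_id INTEGER REFERENCES customer(customer_id),  -- referential integrity
          product     TEXT NOT NULL
      )""")

  con.execute("INSERT INTO customer VALUES (1, 'Acme', 'DE')")
  con.execute("INSERT INTO purchase VALUES (10, 1, 'Widget')")        # accepted

  try:
      con.execute("INSERT INTO purchase VALUES (11, 999, 'Widget')")  # no such customer
  except sqlite3.IntegrityError as error:
      print("rejected by the database:", error)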
Having a single, well-controlled, and well-defined data-integrity system increases
  • stability (one centralized system performs all data integrity operations)
  • performance (all data integrity operations are performed in the same tier as the consistency model)
  • re-usability (all applications benefit from a single centralized data integrity system)
  • maintainability (one centralized system for all data integrity administration).
As of 2012, since all modern databases support these features (see Comparison of relational database management systems), it has become the de facto responsibility of the database to ensure data integrity. Outdated and legacy systems that use file systems (text, spreadsheets, ISAM, flat files, etc.) for their consistency model typically lack any kind of data-integrity model. This requires organizations to invest a large amount of time, money, and personnel in building data-integrity systems on a per-application basis that effectively just duplicate the existing data integrity systems found in modern databases. Many companies, and indeed many database systems themselves, offer products and services to migrate outdated and legacy systems to modern databases to provide these data-integrity features. This offers organizations substantial savings in time, money, and resources because they do not have to develop per-application data-integrity systems that must be refactored each time business requirements change.

Examples

An example of a data-integrity mechanism is the parent-and-child relationship of related records. If a parent record owns one or more related child records, all of the referential integrity processes are handled by the database itself, which automatically ensures the accuracy and integrity of the data: no child record can exist without a parent (also called being orphaned), and no parent record loses its child records. It also ensures that no parent record can be deleted while the parent record owns any child records. All of this is handled at the database level and does not require coding integrity checks into each application.
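A compact sketch of this parent-and-child behaviour, again with sqlite3 (the table names are invented for the example): once the foreign key is declared, the database itself refuses to delete a parent row that still owns children.

  import sqlite3

  con = sqlite3.connect(":memory:")
  con.execute("PRAGMA foreign_keys = ON")
  con.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
  con.execute("""CREATE TABLE child (
                     id INTEGER PRIMARY KEY,
                     parent_id INTEGER NOT NULL REFERENCES parent(id))""")
  con.execute("INSERT INTO parent VALUES (1)")
  con.execute("INSERT INTO child VALUES (100, 1)")

  try:
      con.execute("DELETE FROM parent WHERE id = 1")   # child 100 would be orphaned
  except sqlite3.IntegrityError as error:
      print("delete blocked by the database:", error)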

File Systems

Research shows that neither currently widespread file systems (such as UFS, Ext, XFS, JFS, NTFS) nor hardware RAID solutions provide sufficient protection against data integrity problems.[2][3][4][5][6] ZFS addresses these issues, and research further indicates that ZFS protects data better than earlier solutions.[7]
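A minimal sketch of the end-to-end checksumming idea behind such protection: a checksum is stored when a block is written and re-verified when it is read, so silent corruption is detected instead of silently returned. This is an illustration only and does not reflect how ZFS or any particular file system actually lays out its data.

  import hashlib

  def write_block(store, key, data):
      """Store the block together with its SHA-256 checksum."""
      store[key] = (data, hashlib.sha256(data).hexdigest())

  def read_block(store, key):
      """Return the block only if it still matches its stored checksum."""
      data, expected = store[key]
      if hashlib.sha256(data).hexdigest() != expected:
          raise IOError(f"checksum mismatch on block {key!r}: data is corrupt")
      return data

  blocks = {}
  write_block(blocks, "b0", b"hello world")
  blocks["b0"] = (b"hellX world", blocks["b0"][1])   # simulate silent bit rot
  try:
      read_block(blocks, "b0")
  except IOError as error:
      print(error)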

Datasheet

From Wikipedia, the free encyclopedia
[Image: a floppy disk controller datasheet.]
A datasheet, data sheet, or spec sheet is a document summarizing the performance and other technical characteristics of a product, machine, component (e.g. an electronic component), material, a subsystem (e.g. a power supply) or software in sufficient detail to be used by a design engineer to integrate the component into a system. Typically, a datasheet is created by the component/subsystem/software manufacturer and begins with an introductory page describing the rest of the document, followed by listings of specific characteristics, with further information on the connectivity of the devices. In cases where there is relevant source code to include, it is usually attached near the end of the document or separated into another file.
Depending on the specific purpose, a data sheet may offer an average value, a typical value, a typical range, engineering tolerances, or a nominal value. The type and source of data are usually stated on the data sheet.
A data sheet is usually used for technical communication to describe technical characteristics of an item or product. It can be published by the manufacturer to help people choose products or to help use the products. By contrast, a technical specification is an explicit set of requirements to be satisfied by a material, product, or service.
An electronic datasheet specifies characteristics in a formal structure that allows the information to be processed by a machine. Such machine readable descriptions can facilitate information retrieval, display, design, testing, interfacing, verification, and system discovery. Examples include transducer electronic data sheets for describing sensor characteristics, and Electronic device descriptions in CANopen or descriptions in markup languages, such as SensorML.
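As a rough sketch of what a machine-readable characteristic might look like, the structure below records a value with its unit and minimum/typical/maximum limits so that tools can query it. The field names and the example part are invented for illustration and do not follow SensorML or any other standard.

  from dataclasses import dataclass
  from typing import Optional

  @dataclass
  class Characteristic:
      """One machine-readable datasheet entry (illustrative structure)."""
      name: str
      unit: str
      minimum: Optional[float]
      typical: Optional[float]
      maximum: Optional[float]

  supply_voltage = Characteristic("supply voltage", "V", 3.0, 3.3, 3.6)

  def within_spec(c, value):
      """Check a measured value against the datasheet limits."""
      low = c.minimum if c.minimum is not None else float("-inf")
      high = c.maximum if c.maximum is not None else float("inf")
      return low <= value <= high

  print(within_spec(supply_voltage, 3.45))   # True: inside the specified range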

Data cable

From Wikipedia, the free encyclopedia
A data cable is any medium that allows baseband transmission (binary 1s and 0s) from a transmitter to a receiver.
Examples include twisted-pair cables (as used for Ethernet), coaxial cables, and optical fibre.

What is data?

Data

From Wikipedia, the free encyclopedia
Data (/ˈdeɪtə/ DAY-tə, /ˈdætə/ DA-tə, or /ˈdɑːtə/ DAH-tə) are values of qualitative or quantitative variables, belonging to a set of items. Data in computing (or data processing) are represented in a structure, often tabular (represented by rows and columns), a tree (a set of nodes with parent-child relationships) or a graph structure (a set of interconnected nodes). Data are typically the results of measurements and can be visualised using graphs or images. Data as an abstract concept can be viewed as the lowest level of abstraction from which information and then knowledge are derived. Raw data, i.e. unprocessed data, refers to a collection of numbers and characters; "raw" is a relative term, as data processing commonly occurs in stages, and the "processed data" from one stage may be considered the "raw data" of the next. Field data refers to raw data collected in an uncontrolled in situ environment. Experimental data refers to data generated within the context of a scientific investigation by observation and recording.
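A small illustration of the three representations mentioned above, using the same made-up items in tabular, tree and graph form:

  table = [                                   # tabular: rows and columns
      {"id": 1, "name": "sensor A", "parent": None},
      {"id": 2, "name": "sensor B", "parent": 1},
      {"id": 3, "name": "sensor C", "parent": 1},
  ]

  tree = {                                    # tree: parent-children relationship
      "sensor A": {"sensor B": {}, "sensor C": {}},
  }

  graph = {                                   # graph: interconnected nodes
      "sensor A": ["sensor B", "sensor C"],
      "sensor B": ["sensor A"],
      "sensor C": ["sensor A"],
  }

  print(len(table), len(tree), len(graph))    # 3 1 3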
The word data is the plural of datum, neuter past participle of the Latin dare, "to give", hence "something given". In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Such usage is the origin of data as a concept in computer science or data processing: data are numbers, words, images, etc., accepted as they stand.
Though data is also increasingly used in humanities (particularly in the growing digital humanities), it has been suggested that the highly interpretive nature of humanities might be at odds with the ethos of data as "given". Peter Checkland introduced the term capta (from the Latin capere, “to take”) to distinguish between an immense number of possible data and a sub-set of them, to which attention is oriented.[1] Johanna Drucker has argued that since the humanities affirm knowledge production as “situated, partial, and constitutive,” using data may introduce assumptions that are counterproductive, for example that phenomena are discrete or are observer-independent.[2] The term capta, which emphasizes the act of observation as constitutive, is offered as an alternative to data for visual representations in the humanities.