Abstract: Data mining extracts knowledge and information from large amounts of data stored in multiple heterogeneous databases. Knowledge and information convey a message, either directly or indirectly.
This paper provides a survey of various data mining techniques. These techniques include association, correlation, clustering and neural networks. The paper also conducts a formal review of the applications of data mining, such as the education sector, marketing, fraud detection, manufacturing and telecommunication. The discussion is based on past research papers and on a study of data mining techniques.

Keywords: Association, Clustering, Data mining, data mining application, knowledge discovery database.

I. Introduction

In the real world, huge amounts of data are available in education, medicine, industry and many other areas. Such data may provide knowledge and information for decision making.
For example, you can identify the students who drop out of a university, or analyze the sales data in a shopping database. Data can be analyzed, summarized and understood in order to meet these challenges [1]. Data mining is a powerful concept for data analysis: it is the process of discovering interesting patterns in huge amounts of data stored in various sources such as data warehouses, the World Wide Web and external sources. An interesting pattern is one that is easy to understand, previously unknown, valid and potentially useful. Data mining is a type of sorting technique which is used to extract hidden patterns from large databases.
The goals of data mining are fast retrieval of data or information, knowledge discovery from databases, identification of hidden patterns and of patterns not previously explored, reduction of complexity, time saving, etc. [2]. Data mining refers to extracting or mining knowledge from large amounts of data. Data mining is sometimes treated as knowledge discovery in databases (KDD) [3]. KDD is an iterative process consisting of the following steps, shown in Figure 1 [4]:

Selection: select the data on which the operation is to be performed from various sources.
Preprocessing: also known as data cleaning, in which unwanted data is removed.
Transformation: transform or consolidate the data into a new format for processing.
Data mining: identify the desired result.
Interpretation/evaluation: interpret the result or query to give a meaningful report or information.

Various algorithms and techniques such as Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees, Genetic Algorithms, the Nearest Neighbor method, etc., are meant for knowledge discovery from databases [5]. The main objective of this paper is to learn about data mining.
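The five KDD steps listed above can be pictured as a chain of small functions. The following is only a minimal sketch under stated assumptions: the student records, the field layout, the 60% attendance threshold and all function names are hypothetical and are not taken from the surveyed literature.

```python
from collections import Counter

# Hypothetical raw records: (student_id, department, attendance_pct, status)
RAW = [
    (1, "CS", 92.0, "enrolled"),
    (2, "CS", None, "dropped"),    # missing attendance, removed during cleaning
    (3, "CS", 41.0, "dropped"),
    (4, "Math", 88.0, "enrolled"),
    (5, "CS", 55.0, "dropped"),
    (6, "CS", 78.0, "enrolled"),
]

def select(records, department):
    """Selection: keep only the records of the department under study."""
    return [r for r in records if r[1] == department]

def preprocess(records):
    """Preprocessing (data cleaning): drop records with missing attendance."""
    return [r for r in records if r[2] is not None]

def transform(records, threshold=60.0):
    """Transformation: consolidate attendance into a 'low'/'high' category."""
    return [("low" if att < threshold else "high", status)
            for _, _, att, status in records]

def mine(pairs):
    """Data mining: count how often each (attendance level, status) pair occurs."""
    return Counter(pairs)

def interpret(counts):
    """Interpretation: report the dropout rate for each attendance level."""
    report = {}
    for level in ("low", "high"):
        total = sum(n for (lv, _), n in counts.items() if lv == level)
        dropped = counts.get((level, "dropped"), 0)
        report[level] = dropped / total if total else 0.0
    return report

def kdd(records, department):
    """Run the five steps in order on the given records."""
    return interpret(mine(transform(preprocess(select(records, department)))))

if __name__ == "__main__":
    print(kdd(RAW, "CS"))
```

In practice KDD is iterative, so the output of the interpretation step would normally feed back into a revised selection or transformation; the sketch shows a single pass only.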
II. Data Mining Techniques

Data mining means collecting relevant information from unstructured data, so it can help achieve specific objectives. The purpose of a data mining effort is normally either to create a descriptive model or a predictive model. A descriptive model presents, in concise form, the main characteristics of the data set. The purpose of a predictive model is to allow the data miner to predict an unknown (often future) value of a specific variable, the target variable [7]. The goals of predictive and descriptive models can be achieved using a variety of data mining techniques, as shown in Figure 2 [8].
Figure 2: Data Mining Models

1.1 Classification: Classification is based on categorical (i.e. discrete, unordered) labels. This technique is based on supervised learning (i.e. the desired output for a given input is known).
It classifies the data based on a training set and its values (class labels). These goals are achieved using decision trees, neural networks and classification rules (IF-THEN). For example, we can apply a classification rule to the past records of students who left the university and evaluate them. Using these techniques we can easily identify the performance of students.

1.2 Regression: Regression is used to map a data item to a real-valued prediction variable [8]. In other words, regression can be adapted for prediction.
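As a minimal illustration of mapping a data item to a real-valued prediction, the sketch below fits a straight line by ordinary least squares. It is one simple form of regression, chosen here for illustration only; the study-hours and exam-score values, and the names `fit_line` and `predict`, are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Map a new data item x to a real-valued prediction."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: hours studied and exam score (known target values)
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

model = fit_line(hours, scores)

if __name__ == "__main__":
    print(model)               # fitted slope and intercept
    print(predict(model, 5.0)) # prediction for an unseen input
```

The training targets (`scores`) must be known in advance, which is what distinguishes this setting from unsupervised techniques.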
In regression techniques, the target values are known. For example, you can predict a child's behavior based on family history.

1.3 Time Series Analysis: Time series analysis is the process of using statistical techniques to model and explain a time-dependent series of data points. Time series forecasting is a method of using a model to generate predictions (forecasts) for future events based on known past events [9]. An example is the stock market.

1.4 Prediction: It is one of the data mining techniques that discovers the relationship between independent variables and the relationship between dependent and independent variables [4]. A prediction model is based on continuous or ordered values.

1.5 Clustering: A cluster is a collection of similar data objects; dissimilar objects belong to other clusters. Clustering is a way of finding similarities between data according to their characteristics. This technique is based on unsupervised learning (i.e. the desired output for a given input is not known). Examples include image processing, pattern recognition and city planning.

1.6 Summarization: Summarization is the abstraction of data. It is a set of relevant tasks that gives an overview of the data.
For example, a long-distance race can be summarized by total minutes, seconds and height.

Association Rule: Association is one of the most popular data mining techniques and finds the most frequent item sets. Association strives to discover patterns in data which are based upon relationships between items in the same transaction. Because of its nature, association is sometimes referred to as the “relation technique”.
This method of data mining is used in market basket analysis to identify a set, or sets, of products that consumers often purchase at the same time [6].

1.7 Sequence Discovery: Sequence discovery uncovers relationships among data [8]. It works on a set of objects, each associated with its own timeline of events. Examples include scientific experiments, natural disasters and the analysis of DNA sequences.
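The association technique described earlier in this section, which looks for items that occur together in the same transaction, can be sketched with two standard measures, support and confidence. These measures are not named in the text above and are introduced here as an assumption; the transactions and function names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping transactions, each a set of purchased items
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def frequent_pairs(transactions, min_support):
    """Item pairs bought together in at least min_support of all transactions."""
    counts = Counter()
    for t in transactions:
        # Sorting gives each pair a single canonical ordering
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

if __name__ == "__main__":
    print(frequent_pairs(TRANSACTIONS, 0.5))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

Counting every pair directly is only practical for small data; algorithms used on real shopping databases prune the candidate item sets rather than enumerating them all.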
III. Data Mining Application

Various fields have adopted data mining technologies because they give fast access to data and valuable information from large amounts of data. Data mining application areas include marketing, telecommunication, fraud detection, finance, the education sector, medicine and so on. Some of the main applications are listed below.

1.8 Data Mining in Education Sector: Applying data mining in the education sector has given rise to a new emerging field called “Education Data Mining”.
It is used to study and improve student performance, student dropout, student behavior and the choice of subjects in a course. Data mining in higher education is a recent research field, and this area of research is gaining popularity because of its potential for educational institutes. Students' data is used to analyze their learning behavior and to predict results [10].

1.9 Data Mining in Banking and Finance: Data mining has been used extensively in the banking and financial markets [11]. In the banking field, data mining is used to predict credit card fraud, to estimate risk, and to analyze trends and profitability. In the financial markets, data mining techniques such as neural networks are used in stock forecasting, price prediction and so on.
1.10 Data Mining in Market Basket Analysis: This methodology is based on shopping databases. The ultimate goal of market basket analysis is to find the products that customers frequently purchase together. Stores can use this information by placing these products in close proximity to each other and making them more visible and accessible to customers at the time of shopping [12].

1.11 Data Mining in Earthquake Prediction: Data mining can be used to predict earthquakes from satellite maps. An earthquake is the sudden movement of the Earth’s crust caused by the abrupt release of stress accumulated along a geologic fault in the interior. There are two basic categories of earthquake predictions: forecasts (months to years in advance) and short-term predictions (hours or days in advance) [13].
1.12 Data Mining in Bioinformatics: Bioinformatics has generated a large amount of biological data. The importance of this new field of inquiry will grow as we continue to generate and integrate large quantities of genomic, proteomic, and other data [4].

1.13 Data Mining in Telecommunication: The telecommunications field implements data mining technology because the telecommunication industry has large amounts of data, a very large customer base, and a rapidly changing and highly competitive environment. Telecommunication companies use data mining techniques to improve their marketing efforts, detect fraud, and better manage telecommunication networks [4].

1.14 Data Mining in Agriculture: Data mining is emerging in the agriculture field for crop yield analysis with respect to four parameters, namely year, rainfall, production and area of sowing. Yield prediction is a very important agricultural problem that remains to be solved based on the available data. The yield prediction problem can be solved by employing data mining techniques such as K-Means, K-nearest neighbor (KNN), Artificial Neural Networks and support vector machines (SVM) [14].

1.15 Data Mining in Cloud Computing: Data mining techniques are used in cloud computing. The implementation of data mining techniques through cloud computing allows users to retrieve meaningful information from a virtually integrated data warehouse, which reduces the costs of infrastructure and storage [15].
Cloud computing uses Internet services that rely on clouds of servers to handle tasks. Data mining techniques in cloud computing help provide efficient, reliable and secure services for users.

IV. Conclusion

This paper provides a general idea of data mining, data mining techniques and data mining in various fields. The main objective of data mining techniques is to discover knowledge from active data. These applications use classification, prediction, clustering, association techniques and so on.
In future work we hope to review various classification and clustering algorithms and their significance.