Data Manipulation Part 1: SQL
May 7th, 2010
Data manipulation is a big part of the data mining process. Some authors claim it can take 80% of a data mining project, and I can only agree. If the data comes from a data warehouse, it can go a lot faster. If you have to dig into (and understand) operational systems or add external data, the work takes even more time. It is therefore of the greatest importance to be efficient at data manipulation. Currently I use two ways to do this task: big SQL queries or ETL, depending on the situation.