My blog: 2013

Saturday, 30 March 2013

IT Lab Session 10

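library(rgl)  # provides plot3d()

# Simulate 100 points and give each one a random colour label
x<-rnorm(100,50,15)
y<-rnorm(100,70,20)
Colours<-c("B","G","R","Y","P")
z<-sample(Colours,100,replace=TRUE)
# data.frame() keeps x and y numeric; cbind() would coerce every column to character
T<-data.frame(x,y,z)

plot3d(T)

plot3d(T, col=rainbow(1000))

plot3d(T, col=rainbow(1000), type='s')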

library(ggplot2)  # provides qplot()

qplot(x,y)

qplot(x,y,alpha=I(1/20))

qplot(x,y,colour=z)

qplot(x,y,geom=c("point","smooth"))

Sunday, 24 March 2013

Data visualization works so well because the human brain is extremely well equipped to process visual information. Through visual means we can pick out patterns and essential themes in huge data sets very quickly.

Unfortunately, the tools to create these visual representations are usually too expensive and difficult for smaller news organizations and everyday citizens to use, creating a gap for the future of community journalism. With the generous support of the Knight Foundation, we created VIDI, a suite of powerful, intuitive Drupal data-visualization modules that anyone can use on any standard set of data, ranging from government databases to demographics and statistics.

VIDI gives you easy-to-use tools to build and embed colorful and dynamic maps for your presentations or your blogs. It includes powerful and intuitive tools for expressing data through maps.

The steps for using the VIDI data visualization tools are as follows:

- Find your story
- Pursue an appropriate dataset for your story (blog, Twitter, Facebook or any website)
- Find an adequate data visualization type to support your story
- Prepare and, if necessary, fix your datasets; if unsure, check the dataset formats
- Explore the color scheme suggestions
- Register or log in to be able to upload and map your datasets
- Copy the embed link to embed your data visualizations into your content
- Come back any time to use the visualizations saved on your My VIDI page

VIDI can generate the following types of data visualization from a given dataset:

- Motion chart
- Timeline map
- Timeline map with path
- Annotated timeline
- Tag map
- Intensity map
- Geo map
- Pie chart
- Gauge chart
- Bar chart
- Area chart
- Column chart
- Line chart
- Scatter chart

Some sample charts that can be created in VIDI are as follows

Conclusion:

VIDI provides all the basic chart types through a very easy user interface. Even a beginner can use it to create the interactive charts they need. However, higher-level customizations are not available at this stage, and this is an area where VIDI could improve. It also currently accepts data in only three file formats. Supporting more file formats, and accepting data entered directly on the web page as another option, would make it more user friendly.

Friday, 15 March 2013

IT Lab Session 8

Analyze the data set "Produc" in the "plm" package

library(plm)
data("Produc", package="plm")

Pooled (OLS) Model

pool<-plm(log(pcap)~log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model=("pooling"), index = c("state", "year"))

Fixed Effects Model

fixed<-plm(log(pcap)~log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model=("within"), index = c("state", "year"))

Random Effects Model

random<-plm(log(pcap)~log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model=("random"), index = c("state", "year"))

pFtest

We reject the null hypothesis of no individual effects, so fixed is better than pool

plmtest

We cannot reject the null hypothesis, so pool is better than random

phtest

We should reject the null hypothesis, so fixed is better than random

Consolidating all the results, the fixed effects model is better for the "Produc" data
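
The three tests named above can be run roughly as follows. This is a minimal sketch that reuses the model specification from this post. The argument order and the type="bp" option for plmtest are my assumptions about how the tests were called, since the post only lists the function names. The null hypothesis of each test is given in the comments.

```r
library(plm)
data("Produc", package = "plm")

# Same specification as the three models in this post
form <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
  log(gsp) + log(emp) + log(unemp)

pool   <- plm(form, data = Produc, model = "pooling", index = c("state", "year"))
fixed  <- plm(form, data = Produc, model = "within",  index = c("state", "year"))
random <- plm(form, data = Produc, model = "random",  index = c("state", "year"))

# F test, H0: no individual effects (the pooled model is adequate)
f_res <- pFtest(fixed, pool)

# Breusch-Pagan LM test, H0: no random effects (the pooled model is adequate)
# type = "bp" is an assumption; plmtest() offers other variants as well
lm_res <- plmtest(pool, type = "bp")

# Hausman test, H0: the random effects estimator is consistent
h_res <- phtest(fixed, random)

print(f_res)
print(lm_res)
print(h_res)
```

A small p-value in the first or third test points towards the fixed effects model, while a large p-value in the second test means the pooled model cannot be rejected in favour of random effects.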

Wednesday, 13 February 2013

IT Lab Session 6

1) Find the log returns of the data and their volatility

2) Create an ACF plot of the log returns and run the Augmented Dickey-Fuller test
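
The two tasks could be approached along these lines. This is only a sketch on a simulated price series, not the data used in the lab. The variable names, the 250 trading days used for annualising, and the choice of adf.test from the "tseries" package are all assumptions.

```r
# Simulated daily closing prices (hypothetical data, not the lab data)
set.seed(1)
price <- 100 * exp(cumsum(rnorm(250, mean = 0, sd = 0.01)))

# Log returns: r_t = log(P_t) - log(P_{t-1})
log_ret <- diff(log(price))

# Volatility as the standard deviation of the log returns,
# annualised under an assumed 250 trading days per year
daily_vol  <- sd(log_ret)
annual_vol <- daily_vol * sqrt(250)

# Autocorrelation function of the log returns (also draws the ACF plot)
acf_res <- acf(log_ret, main = "ACF of log returns")

# Augmented Dickey-Fuller test, run only if the 'tseries' package is installed
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::adf.test(log_ret))
}
```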

Thursday, 7 February 2013

IT Lab Session 5

Q1) Download a large NSE data set (at least 6 months) and generate returns, taking the 10th data point as the start and the 95th data point as the end

Ans)
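
One way to generate such returns is sketched below. It uses a simulated price series in place of the downloaded NSE file, and the names close_price, selected and returns are hypothetical. It also assumes simple (not log) returns.

```r
# Simulated closing prices standing in for a downloaded NSE series
set.seed(42)
close_price <- 5000 * exp(cumsum(rnorm(130, mean = 0, sd = 0.012)))

# Keep the 10th to the 95th data point (86 observations)
selected <- close_price[10:95]

# Simple returns: (P_t - P_{t-1}) / P_{t-1}
returns <- diff(selected) / head(selected, -1)

head(returns)
```

With 86 selected prices this gives 85 returns, because the first price has no previous value to compare against.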

Q2) Predict the data for 701 to 850 rows for the data given
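Ans) Treated the "age" and "ed" attributes as categorical variables and worked accordingly

> # Load the data; rows 1 to 700 are used to fit the model
> logit<-read.csv(file.choose(), header=TRUE)
> logit.eg<-logit[1:700,]
> logit.eg$age <- factor(logit.eg$age)
> logit.eg$ed <- factor(logit.eg$ed)
> # Logistic regression of default on the borrower attributes
> logit.est <- glm(default~age+ed+employ+address+income+
+ debtinc+creddebt+othdebt, data = logit.eg ,
+ family = "binomial" )
> # Rows 701 to 850 are the cases to predict
> logit.eg2<-logit[701:850,]
> logit.eg2$age <- factor(logit.eg2$age)
> logit.eg2$ed <- factor(logit.eg2$ed)
> # predict() stops with an error if these rows hold an age or ed level absent from rows 1 to 700
> logit.eg2$prob <- predict(logit.est, newdata = logit.eg2, type = "response")
> logit.eg2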
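
The same fit-then-predict workflow can be tried without the original CSV file. The sketch below uses simulated data, so the column values, the coefficients used to simulate the outcome and the 0.5 cut-off are all assumptions. It converts the categorical column to a factor before splitting, so that both subsets share the same factor levels.

```r
set.seed(123)
n <- 850

# Simulated borrower data (hypothetical values, not the lab file)
sim <- data.frame(
  ed      = sample(1:5, n, replace = TRUE),
  income  = round(rlnorm(n, meanlog = 3.5, sdlog = 0.5)),
  debtinc = round(runif(n, 1, 30), 1)
)

# Simulated 0/1 outcome whose probability rises with the debt-to-income ratio
sim$default <- rbinom(n, 1, plogis(-3 + 0.15 * sim$debtinc))

# Convert to factor BEFORE splitting so both subsets share the same levels
sim$ed <- factor(sim$ed)

train <- sim[1:700, ]
test  <- sim[701:850, ]

fit <- glm(default ~ ed + income + debtinc, data = train, family = "binomial")

# Predicted probability of default for rows 701 to 850
test$prob <- predict(fit, newdata = test, type = "response")

# Assumed cut-off of 0.5 to turn probabilities into a 0/1 prediction
test$pred <- as.integer(test$prob > 0.5)

head(test)
```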