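As you said, I couldn't find a suitable Python time series analysis book either, so I'll share how I studied when I was working :)
I learned time series analysis with R as an undergraduate. When developing at work, I used that book for time series data preprocessing, and then handled model fitting and visualization with the statsmodels package and some googling. Model fitting only takes a few lines of code anyway and preprocessing is most of the work, so if that book gets you comfortable handling time series data and you already know the theory, you should be able to do the analysis you want without much trouble :)
Alternatively, you might look for books on handling financial data with Python :) I haven't read any, but financial data is time series data too.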
EDIT: Ok, so to summarize for posterity:
Statistical Inference by Casella
OpenIntro Statistics
Statistics by David Freedman
All of Statistics by Wasserman
Applied Statistics by Montgomery
And best of all :...The iron of statistic by Walid Miak
Also some courses (which I haven't checked myself):
Some time ago, there was a discussion on a listserv to which I subscribe regarding statistical software preference. Someone had mentioned a strong preference for the use of R and since that time, I have downloaded the software package (seeing as how it's freeware). However, in looking at the interface, I am at a loss regarding how to actually use the application, and I currently cannot commit the time necessary to pore through the hundreds of help articles or forums. That being said, I looked into some R tutorial books and I wanted to see if anyone has any experience with the books I have listed below or if there are any other recommendations (the ones listed are based on reviews). I am currently gravitating towards Andy Field's book because his writing style is accessible and entertaining, but I also feel that there may be some "wasted chapters" because I already have the SPSS version of his book and I assume that there will be some redundancy. I am also open to the idea that I might need to buy 2 books.
I will likely be conducting traditional statistical analyses (e.g., factor analysis, discriminant function analysis, MANOVA/MANCOVA, ANOVA/ANCOVA, regression), but I would also like to learn how to conduct other analyses through R (e.g., canonical correlation analysis, structural equation modeling, path analysis, time series analysis, etc). I have not used some of these techniques, so a book that includes didactics regarding the nature of these analyses would also be ideal. I appreciate any insight into this. Thank you for your time and I hope everyone has a nice day.
Discovering Statistics using R (Andy Field, Jeremy Miles, & Zoe Field)
The R Book (Michael J. Crawley)
R Cookbook (Paul Teetor)
R for Dummies (Joris Meys and Andrie de Vries) (they have one of these books for everything, don't they?)
Introductory Statistics with R (Peter Dalgaard)
R by Example (Use R!) (Jim Albert and Maria Rizzo)
I bought The R Book by M. Crawley and found it really helpful. It helps you learn how to use the software but also gives some hints on how to run the stats. I use it over and over every time I am trying to learn some new analyses! I warmly recommend it. I also have the R Graphics book, but this doesn't really add much to what you would already find in The R Book, unless you want to do advanced quality graphs.
Thomas, I just finished up a stint learning R, as I had previous knowledge/experience with SPSS and SAS. I found that once the code and structure of R made sense, the language is very good. As part of the learning process I used The Art of R Programming: A Tour of Statistical Software Design by Matloff [ISBN-13: 978-1593273842].
This was a strong intro book to get into R.
What I found was really helpful for seeing how to construct some of the more complex models was using a couple tools, Deducer and R Commander. These are GUI packages that extend R and let you do some pretty good modeling with simple point and click but you can see the code generated which helped me learn good practice for using various functions.
A final thought: while your time may be limited, the forums and help articles do provide an additional component in that they discuss various package extensions for R. The true power of R lies in the fact that anyone can write add-on packages to extend functionality, and there are some great ones out there.
Thank you everyone for your recommendations and feedback! I will definitely set some time aside in the next couple of weeks to start learning how to use this application. Take care and I hope everyone enjoys the rest of their week.
Dear Thomas, I can only agree with Ivan Maggini: Crawley's The R Book picks up right at the very basics, but won't leave you out in the rain once you get the stats going. This is probably the only book you will need for a very long time... Good luck getting started! S.
Hi Thomas, I'd encourage you to go with either Crawley's or Teetor's; they both nicely cover the very basics and provide some advanced applications. You may also check a course on 'Computing for Data Analysis' at coursera.org, if you wish to get the basic foundations through interactive e-learning. However, and to wrap up, I would suggest Crawley's if you envision establishing a 'long-term relationship' with R. All the best,
It comes with a book written by its main developer and is very suitable for getting an overview of a new dataset. After a session you can see the equivalent R code that your actions in the UI have produced.
Here is a link to a number of books, videos, and guides for learning various aspects of R. This includes data management, statistics, and visualization.
I found "Discovering Statistics using R" (Andy Field, Jeremy Miles, & Zoe Field) quite helpful, particularly if you need thorough explanations of statistics as well as R programming. The book usually gives very detailed step-by-step instructions of how to perform a test using R, as well as a lot of explanations on the background behind statistical tests. That said, it does contain some errors and inconsistencies, and I usually double-check the information with more reliable sources, depending on the topic. Particularly, for mixed models I recommend Pinheiro and Bates: "Mixed-Effects Models in S and S-PLUS" (as R is basically a further development of S, you can use the same code for R).
One nice thing I discovered in R is its packages. Instead of trying to learn everything right away, another option would be to learn specific packages directly, which can give you quick hands-on tools, and then build a deeper understanding along the way.
Also be aware that, depending on your areas of interest and applications, someone may already have created a package that you can just apply to your problem.
And the nice thing about R is that all packages are required to come with a package manual, which is a nice place to learn about the package and also the function arguments.
I hope you will enjoy learning to use packages in R.
This would be a nice place to start looking into time series packages and their use
Brian Everitt's Handbook of Statistical Analysis was where I began to get comfortable with R. I'd also recommend looking at the Journal of Statistical Software, a free online journal, which describes R packages with tutorials on their use.
Just to add some (hopefully) helpful context. My R book is basically the SPSS book but for R, so the examples are the same, as is a lot of the theory. Having said that, because R is such a different programme to SPSS, there are a lot of differences in approach/structure. The similarities can be good, in that you can replicate the examples that you know in SPSS but using R. As a learning tool this might be useful. It might also be a lot of pointless redundancy, depending on how you look at it. Different people will see it as a plus or a minus, I suspect. Otherwise, I think Crawley's R book is very good and thorough, and the website Quick-R is also great. R for Dummies is extremely good for getting to grips with the R interface and manipulating data etc. It's probably the best book I have seen for this, but it covers less applied stats, as you might expect. I'm not familiar enough with the other books to comment.
Thank you again everyone for the helpful advice, perspectives, and recommendations! It looks like I'll be going through some of the free materials and buying a couple of different books. Cheers!
Andy Field wrote: "My R book is basically the SPSS book but for R, so the examples are the same as is a lot of the theory."
If that is so, that book would be worth looking into. The SPSS book is probably the most pleasant statistics book I've read and I learned a lot from it.
Hi Mitchell and Phillip: thanks for this answer. I had a look at some of the chapters (free download; compare the link below for chapter 1 from CRAN). Is that similar to the textbook?
It's similar in that it covers some basics. The book has a lot more explanations. For example, it starts off with an extensive review of the help functions across mac, PC, and linux. Although the information in the link you cite is accurate, the book's more designed to get you up and running quickly with a lot of explanations along the way. It's a little like having someone thoroughly explain the interface. I think it's worth the money (I just ordered it as a Nook book recently).
Thanks Phillip, sounds really good. Please tell me more when you have the book. I had a download link this morning, but unfortunately my university does not support that database, otherwise I would own it now :(
One more thing, for a more advanced user who already knows the basic operations: I learned *a lot* of R just by reading the fabulous manuals, reference manuals and studying the provided examples. Also, many packages contain vignettes or manuals, which are often v.v. good (in fact, many of them with time turned into actual books). Use the "?" and "??" from R command line a lot.
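To make the "?" and "??" tip concrete, here is a minimal sketch of the built-in help tools. The topic "mean", the search phrase "linear model", and the choice of the grid package are only illustrations; any installed function or package would do.

```r
# Look up a function whose name you already know.
# At the console the shorthand is:  ?mean
h <- help("mean")

# Search the installed documentation by keyword when you do not know the name.
# At the console the shorthand is:  ??"linear model"
hits <- help.search("linear model")

# List the vignettes that ship with one installed package (grid is used here
# only as an example of a package that comes with R).
vigs <- vignette(package = "grid")
```

Typing `h`, `hits`, or `vigs` at the console displays the corresponding help page, search results, or vignette list.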
Hi January, thanks for the tips. Actually I use the manuals as my first reference, and blogs as my second. But yours sound better, so I bookmarked both (I just googled). What I have now learned from you are two things: the "?" and the Cookbook. I had a look at it, and it looks good. Thank you so much. I also looked at Mark Gardener: Beginning R.
To start with, I would consult An Introduction to R, which can be found at http://cran.r-project.org. It's free and gives you everything you need to get started. I would then suggest you move on to the R Cookbook by Paul Teetor. It's a good guide but also acts as a good reference, even for advanced users.
Also the guides on the R site can be a bit hit or miss but some are excellent.
I also like the books from Pfaff and his procedures, just for those who seek more alternatives :) Also, some universities have an R team, for example ETH Zurich or the Institute for Statistics at the University of Bern. So much from my side.
I know what your feeling is like. I've been through it too. R is incredible and very versatile, but on the "first date" it looks a bit cryptic. Personally, I think the 'R Book' is well done because of its example scripts and, above all, its explanations of the R output, which should not be underestimated! I reckon that book is a good starting point. Depending on the aim of your analysis, you will probably need more reference material from either other books or the R package manuals. It's hard at the beginning, but do not give up!
Otherwise the R book by Crawley is great. Plus you can learn so much from all the resources online, esp. Stack Overflow. The atmosphere can be a little hostile sometimes towards new users, but as long as you demonstrate that you've tried some things, have done some reading and give reproducible code, you're covered!
You can refer to the guidance document of 'Biodiversity R'. It has got some advanced techniques. Also have a look at 'Applied Spatial Data Analysis with R' by Roger S. Bivand and Edzer J. Pebesma.
Considering the coverage you are looking for, I recommend "Numerical Ecology with R", by Daniel Borcard, François Gillet and Pierre Legendre, published in 2011 in the series "Use R!", Springer, XI + 306 pp. The examples are mainly from ecology, but the book leads you step by step through the application of most major techniques of multivariate data analysis. See http://adn.biol.umontreal.ca/~numericalecology/
Many thanks, Sarah-Jo. It was helpful - R for SAS users, exactly what I needed! I'd rather use Google and other Internet possibilities than books. Books are expensive!
If you are using R outside of the world of statistics, I would recommend "The Art of R Programming" by Norman Matloff as a good reference for writing much more computationally and memory efficient R code. http://nostarch.com/artofr.htm
I would think Andy Field's text matches what you're looking for pretty exactly. You can always skip the bits you read in his SPSS version - I find there's lots I skip in his writing anyway :-)
Then it is a matter of reading the manuals of the particular packages you would install when wanting to do something specific. The documentation that comes with R packages usually offers useful examples.
Well, I see a plenty of extremely helpful suggestions here. But I would like to share my experience as a beginner of R during August 2011. The only things you need to learn as a beginner of R are:
1. The R operators.
2. The R object types and how to generate, coerce and exchange between them.
3. The R functions and how to write them with arguments.
And to learn them you don't need any book; they are well documented in "An Introduction to R" (http://www.r-project.org/) (someone has mentioned it already). The applications of R have become so diversified and far-reaching that you might only need a book to learn very specific, application-oriented R programming. But what I do is type into Google exactly what I need to do in R. Believe it or not, there are hundreds of webpages waiting to help you, and that yields far better results than digging into a book.
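As a rough illustration of the three items in the list above, here is a small sketch. The vector `x` and the helper `rescale` are made-up names for this example, not something taken from "An Introduction to R".

```r
# 1. Operators: assignment, vectorized arithmetic, comparison.
x <- c(2, 4, 6)            # `<-` assigns; c() builds a numeric vector
doubled <- x * 2           # arithmetic works element by element
above3 <- x > 3            # comparison returns a logical vector

# 2. Object types, and coercing between them.
chr <- as.character(x)     # numeric vector -> character vector
lst <- as.list(x)          # vector -> list with one element per value
m <- matrix(1:6, nrow = 2) # integer matrix with 2 rows and 3 columns
df <- as.data.frame(m)     # matrix -> data frame (one column per matrix column)

# 3. Functions with arguments and default values.
# rescale() is a hypothetical helper that maps v linearly onto [lower, upper].
rescale <- function(v, lower = 0, upper = 1) {
  (v - min(v)) / (max(v) - min(v)) * (upper - lower) + lower
}
scaled <- rescale(x)                 # uses the defaults: 0.0 0.5 1.0
scaled10 <- rescale(x, upper = 10)   # overrides one argument: 0 5 10
```

Calling `class()` or `str()` on each object is a quick way to check what a coercion actually produced.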
All the books mentioned above are really helpful, but I do find the R book by Michael Crawley a real treasure. Not only is it helpful in learning R, but it has also helped me get valuable insight into some statistical concepts. It is updated with some of the newest concepts in classification and data mining too.
Two books that illustrate how to use R when using ANOVA, MANOVA, ANCOVA and various regression methods are Wilcox (2012, Modern Statistics for the Social Sciences) and Wilcox (2012, Introduction to Robust Estimation and Hypothesis Testing). A possible appeal of these books is that they also include modern robust methods that can substantially increase power when standard assumptions are violated.
To a beginner what I am suggesting is to start with R Commander package with R. Since this is menu driven this will act as a bridge from earlier software that you used to R. Using this package you can perform many basic statistics. Then use Quick R website (http://www.statmethods.net/) to understand some basic codes. In this stage one can read other relevant R books to understand the advanced features of R.
Duda, please look at German Rodriguez's Introducing R at http://data.princeton.edu/R. It simplifies R to the benefit of a beginner. It is one of the materials that helped me conquer R.
If you already have experience managing data sets and doing statistical analysis in SAS or SPSS, examine the book "R for SAS and SPSS Users" by Robert Muenchen. He also wrote one for STATA users. Then get the book for your application, such as MANOVA.
I notice you also mention that you found the R "interface" a bit intimidating and that it was difficult to figure out "how to actually use the application"! You might find that RStudio (http://www.rstudio.com/ide/download/) helps you get over that obstacle. No doubt R gurus would spurn it in favor of Emacs (e.g. http://ess.r-project.org/) or some even plainer text editor, but it does make things much easier for a beginner, and is much more similar to programs you are familiar with such as SPSS and SAS.
I highly recommend Visualize This by Nathan Yau. Both this book and the author's blog, FlowingData, contain lots of tutorials about using R in order to do some good statistics. Take a look at the blog and then decide! Cheers!!!
Yet another useful book is Using R for Introductory Statistics by John Verzani.
For more depth (regarding statistical methods) I recommend the "MASS" book (Modern Applied Statistics with S) by Ripley and Venables. (The S in the title refers to the language; the book is intended for both of its main implementations, the programs Splus and R.)
Note also that many R programs are accompanied by detailed instructions and papers with tutorials.
I'd agree that "Statistics. An introduction using R" by M. Crawley is very useful, both to learn R and understand statistics. It explains the fundamentals of the statistics and walks you through the R code.
RStudio is a good interface (GUI), and R in Action (Kabacoff, R) and A Handbook of Statistical Analysis Using R (Everitt, BS; Hothorn, T) are excellent books.
Apart from the books available on the R website (http://www.r-project.org, manual section), I started my adventure with R with Peter Dalgaard's very useful book,
"Introductory Statistics with R", published by Springer.
It will guide you from the basics of R and statistics to more advanced analyses.
Learning R is about practice, searching, trial and error. When you encounter a problem, Google is often the first choice. You will find answers quite often on http://stackoverflow.com/.
For the books, I think R in Action is a great reference, not only for statistics but also for data visualization. The book is systematically written and well organized. The content covers basic statistics and intermediate methods such as regression, permutation tests, generalized linear models, PCA, and dealing with missing data. At the same time, its companion website is also very useful: http://www.statmethods.net/. If you are already familiar with basic statistics, I think it's a nice starting point for learning R in practice and using it!
And I'm still getting great recommendations! Thanks everyone so much for your time in responding to my question. Learning R will be one of my primary projects over winter break. Thank you again! :)
As one of the authors of R for Dummies, I'm bound to suggest that one to you as well. But I'd like to add a sidenote: R for Dummies looks at R from a programming point of view, not so much a statistical point of view.
We chose to take "the other route" as I have daily experience with the problems that arise due to copy-pasting solutions from other people without understanding the underlying structure of the objects and how to work with them. Yet, as R _is_ first and foremost a programming (scripting) language, you need a fair idea about how to work with the objects.
I get R users at my desk who, even with more than 3 years of experience, still don't know e.g. that a data frame is a list and not a matrix, and especially don't grasp the consequences of this fact.
As I noted to some critics before, everything you need to learn R is to be found for free on the internet. R for Dummies is merely a (hopefully useful) summary in a sequence we deemed suited to learn R from scratch.
But whatever you do, don't copy code you don't understand, and spend a fair amount of time figuring out the programming aspects, not only the statistical aspects of R.
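The data frame point above can be checked directly at the console. This is only a small sketch with a made-up two-column data frame (`df`, `id` and `name` are illustrative names), showing a few of the consequences of "a data frame is a list".

```r
# A small illustrative data frame with one integer and one character column.
# stringsAsFactors = FALSE is set explicitly so the result does not depend on
# the R version's default.
df <- data.frame(id = 1:3, name = c("a", "b", "c"), stringsAsFactors = FALSE)
m <- as.matrix(df)

df_is_list <- is.list(df)   # TRUE: a data frame is a list of equal-length columns
m_is_list <- is.list(m)     # FALSE: a matrix is an atomic vector with dimensions

n_df <- length(df)          # 2: the number of columns (list elements)
n_m <- length(m)            # 6: the number of cells

col_types <- sapply(df, class)  # each column keeps its own type
m_type <- typeof(m)             # "character": a matrix holds a single type,
                                # so the integer ids were coerced to strings

ids <- df[["id"]]           # list-style extraction returns the column vector
```

One practical consequence: functions that loop over a list, such as `sapply()` or `lapply()`, walk a data frame column by column, while converting it to a matrix can silently change the type of every column.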
I strongly recommend 'Using R for Introductory Statistics' by John Verzani. I used it when learning R and it provided me with a strong basis. Very good for teaching you the R language and stats at the same time.
Georgia Southern University, Jiann-Ping Hsu College of Public Health
Try Clinical Trial Data Analysis with R by Din Chen and Karl Peace, published by Chapman/ Hall Biostatistics Series. You may also want to consider Applied Meta Analysis Using R, also by Chen and Peace and published by Chapman Hall Biostatistics Series.
I have used the Dalgaard book, and I find it to be very helpful. Another book, "R in a Nutshell" by Joseph Adler, is a helpful reference, but don't expect to learn R from it.
Do any of the books have explanations with examples for things like generating permutation distributions or even MCMC methods?
Hi there! I discovered R by taking the Statistics, Data Analysis and Computing for Data Analysis classes on www.coursera.org. I think the interactions and also the course materials and resources (some of which are named above) would add value and more depth to your endeavour compared with only working through a book page by page. Good luck with your work!
If you're already comfortable with the statistics then I would not recommend Andy Field's book because (a) it focuses primarily on the statistics, spending much time (i.e., pages) on trying not to scare students away, and (b) it does not introduce R in the easiest possible way but tries to adapt R usage to the requirements of an SPSS stats book, resulting in examples that may start off scarier than necessary. I prefer a more bare-bones initial approach to R, with a minimum of functions and external libraries, focusing on how simply and coherently you can get basic stuff done.
I concur with recommendations for online introductions, such as tutorials marked "for psychologists" and such in the "under 100 page" section of the R contributed documentation pages.
Having said that, I do recommend Field's book for someone who also needs to learn the stats starting at the beginning, for the well-known reasons that have made Field's book so popular with students.
Ministero dell'Istruzione, dell'Università e della Ricerca
For time series analysis I suggest the book by Shumway and Stoffer. For regression, the newest book by Fahrmeir et al., "REGRESSION", which has a lot of updated examples in R, STATA and other packages. For simple programming, www.datamind.org.
Thomas, if you haven't already, I would recommend downloading R-Studio which is a popular 'integrated development environment'. It includes lots of features that make using R easier including adding in additional packages which is a common task.
I would recommend R in a Nutshell by Adler and Introductory Statistics with R by Dalgaard. Both are so helpful. The Quick-R website is also a good source at the elementary level.
Murray Logan's book (Biostatistical Design and Analysis Using R: A Practical Guide) is a fine way to begin with R. For multivariate analysis (PCA, CCA, RDA, ...) I can suggest you try the website of the ade4 package, but the problem may be that it is in French. However, there is the adelist, a mailing list used to announce news about the ade4 package for R and to allow users to exchange information. For time series analysis, you have Wood's book on Generalized Additive Models in R. There is also an R tutorial of ~20 pages about time series analysis with R (Zucchini and Nenadić, Time series analysis with R - part I). You can also have a look at http://a-little-book-of-r-for-time-series.readthedocs.org/en/latest/src/timeseries.html.
R has a tremendous number of resources you can use. In this sense I suggest going to the Contributed Documentation on the CRAN website (see Manuals\Contributed Documentation at the bottom of the page):
Here you can surely find a guide for the majority of the statistical techniques you are planning to use. Please consider that sometimes you may need some other tutorials or guides, so my suggestion is to be aware of the powerful search engines that let you find the statistical techniques of interest. So you can use:
Last but not least, you can use the function findFn from the package "sos", which lets you search for a method (for example) across the various packages that can be installed.
Uniformed Services University of the Health Sciences
Dear Thomas, I second Xuanlong's recommendation for the "Intro to R tutorial". It summarizes very important basics. There is a Youtube video that covers the Intro to R at
With just these basics behind you, and as with any programming language, the best way to learn is to start programming on a problem that interests you. Regardless of what platform you use, you should have two windows open, at least, one interactive and one text editor. This can be done many ways: Rstudio to emacs... Use the manual: "?plot", "?randomForest", etc. Every manual page has one or more examples that you can run. This, in my opinion, is the best text.