Talk:R (programming language)
This is the talk page for discussing improvements to the R (programming language) article. This is not a forum for general discussion of the article's subject.
R (programming language) was an Engineering and technology good articles nominee, but did not meet the good article criteria at the time. There may be suggestions below for improving the article. Once these issues have been addressed, the article can be renominated. Editors may also seek a reassessment of the decision if they believe there was a mistake.
The article Datasets.load was nominated for deletion. The discussion was closed on 24 September 2018 with a consensus to merge the content into R (programming language). If you find that such action has not been taken promptly, please consider assisting in the merger instead of re-nominating the article for deletion. To discuss the merger, please use this talk page. Do not remove this template after completing the merger. A bot will replace it with {{afd-merged-from}}.
Milestones
The table in the Milestones section shows R versions in the format x.y, e.g. R 3.6. However, except for some of the historical releases, the formal version format is x.y.z, e.g. R 3.6.0. The date associated with each entry appears to be the date of the x.y.0 release. Should the 'Release' column be updated to use the x.y.0 format?
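For anyone checking the version format against a running installation, here is a minimal sketch using base R's getRversion(). The variable names are only illustrative, and the snippet assumes nothing beyond base R.

```r
# Full version of the running interpreter, as a package_version object.
v <- getRversion()

# Integer components of the version: major (x), minor (y), patch (z).
parts <- unlist(unclass(v))

# The formal x.y.z form, the shortened x.y form used in the table,
# and the x.y.0 release that opens the same series.
full_version    <- paste(parts, collapse = ".")
short_version   <- paste(parts[1:2], collapse = ".")
initial_release <- paste(c(parts[1:2], 0), collapse = ".")
```

On any given installation, full_version should match what R.version.string reports, while short_version corresponds to the x.y labels currently used in the Milestones table.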
Add .rhistory
.rhistory is another file type; it stores the history of the code executed in an R session. I want to add it to the file types list, but I am new to Wikipedia and I don't know how. AHWikipedian (talk) 11:55, 13 November 2023 (UTC)
- I have added that in for you Pansydyke (talk) 16:55, 19 January 2024 (UTC)
Moved Comparison with alternatives/Python to talk
@Newystats: I moved this paragraph to talk:
Comparison with alternatives/Python
Python and R are interpreted, dynamically typed programming languages with duck typing that can be extended by importing packages. Python is a general-purpose programming language while R is specifically designed for doing statistical analysis. Python has a BSD-like license in contrast to R's GNU General Public License but still permits modifying language implementation and tools.[1]
Why is R being compared with Python? Python is a general-purpose programming language, but R is a special-purpose programming language. This paragraph is comparing apples with oranges. R_(programming_language)#Interfaces says you can embed R in Python by installing rpy2. The implication is that you can have both full Python and full R.
- You can also embed full Python and other languages in R, as described in Yihui Xie; Joseph J. Allaire; Garrett Grolemund (30 December 2023), R Markdown: The Definitive Guide, Chapman & Hall, Wikidata Q76441281
Regarding "Python has a BSD-like license in contrast to R's GNU General Public License but still permits modifying language implementation and tools.":
- This contrast is immaterial.
- This sentence is the only one that is cited. The book title of the citation is intriguing: "Python vs. R for Data Science." However, the paragraph doesn't paraphrase the book's thesis.
Timhowardriley (talk) 20:32, 3 January 2024 (UTC)
- @Timhowardriley: I object to removing the section on "Comparison with alternatives". If you think it's biased, please propose changes that remove the bias.
- Wikipedia has many comparisons like this that provide a valuable service. Only yesterday I got substantial help with something I was doing from a crudely similar comparison on Wikipedia. In my judgment deleting the entire "Comparison with alternatives" section degrades the quality of this article.
- I'm restoring that entire section including the discussion of Python. I plan to add other material, but I'm not exactly certain what just yet.
- DavidMCEddy (talk) 14:24, 5 January 2024 (UTC)
- Python and R are the two leading programming languages in data science and the comparison is very frequently discussed in relevant sources, so I think it makes sense to include it here. However I agree that the section as it stands is pretty shallow. – Joe (talk) 14:42, 5 January 2024 (UTC)
- Regarding "If you think it's biased, please propose changes that remove the bias.": Comparisons between products and services are best handled through a table. For a narrative comparison to be unbiased, it requires a lot of words to fairly describe each differentiating characteristic. Most importantly, Wikipedia articles need to be reliably sourced. As Wikipedia editors of this product, we are inherently biased. Instead, a reliable source (like Consumer Reports) needs to compare R with a competitor, then we can paraphrase that material. On the other hand, simply name-dropping the NY Times is misleading. I got past the paywall once to read the article. I remember it being very supportive of R and having only a mention of SAS. Moreover, it quoted SAS's marketing manager, who refuted the SAS disparagements. The Comparison of statistical packages link in the "See also" section is the proper way to compare R with its competitors.
- Regarding "I'm restoring the ... discussion of Python": Please refute any of my claims that this is a lousy paragraph. Timhowardriley (talk) 23:28, 5 January 2024 (UTC)
- Regarding "I plan to add other material, but I'm not exactly certain what just yet.": The cart is in front of the horse. Wikipedia articles need to be reliably sourced. Step one is to discover something relevant in your secondary research. Step two is to paraphrase that material into the Wikipedia article. Otherwise, it's original research. Timhowardriley (talk) 23:55, 5 January 2024 (UTC)
- Accepted re. deleting the "Comparison with alternatives". Thanks, DavidMCEddy (talk) 16:24, 6 January 2024 (UTC)
Removing the description "as a programming language to teach introductory statistics at the University of Auckland"
The introduction says "was started by professors Ross Ihaka and Robert Gentleman as a programming language to teach introductory statistics at the University of Auckland." This struck me as a bit odd, as R is not generally considered a tool for teaching introductory statistics. Anyway, there's a hyperlink cited for this introductory statistics comment. This is what the source actually says, with no reference whatsoever to teaching introductory statistics:
Early History - 1990
• Ross Ihaka joins the Department of Statistics at the
University of Auckland.
• Robert Gentleman spends 1990 in Auckland on sabbatical
from the University of Waterloo.
• During a chance encounter in the corridor, the following
exchange takes place:
Gentleman: “Let’s write some software.”
Ihaka: “Sure, that sounds like fun.”
• The initial goal is to build a testbed for trying out ideas and to
publish a paper or two.
Drkirkby (talk) 15:15, 3 February 2024 (UTC)
- The quote from the PDF link is, "We set a goal of developing enough of a language to teach introductory statistics courses at Auckland." I added this quote to the citation. Timhowardriley (talk) 17:39, 3 February 2024 (UTC)
- The "Let's write some software" quote is on page 10 of the PDF. The citation points to page 12 of the PDF. Timhowardriley (talk) 17:45, 3 February 2024 (UTC)
- My error, I missed that.
- I'm having the misfortune of having to learn some introductory statistics with Minitab. I wish we were using R, but the university does not consider R an appropriate language to teach introductory statistics. I can see their point to be honest. Drkirkby (talk) 22:23, 4 February 2024 (UTC)
- The best way to learn statistics is to get a good eraser. ;-) Timhowardriley (talk) 04:38, 5 February 2024 (UTC)
Misuse of print() and return()
In a lot of the code examples the print() and return() functions are used incorrectly. For example, R does not require print(x) to print the values of a vector x. Simply typing the name of the vector at the top level will do that. There are places where print() should be used (for example in the middle of a function), but not in most of the code shown. This is an important distinction between R and other languages.
Likewise, you do not need return() at the end of a function definition. The value of the last expression evaluated will be returned. You do need to use return() if you are returning from the middle of a function. Again, this is an important distinction between R and other languages.
I would like to edit the examples to reflect this, unless someone has a reason for not doing so. Mcsmom (talk) 15:05, 18 February 2024 (UTC)
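To make the proposal concrete, here is a minimal sketch of the two behaviours described above. The function names show_progress and early_exit are hypothetical and do not come from the article's examples.

```r
x <- c(1, 2, 3)

# At the interactive prompt, a visible top-level expression is auto-printed,
# so this line behaves like print(x).
x

# Hypothetical function: auto-printing does not apply inside a function body.
show_progress <- function(v) {
  v          # evaluated, but nothing is printed here
  print(v)   # an explicit print() is needed in the middle of a function
  sum(v)     # the value of the last expression is returned
}

# Hypothetical function: return() is needed to leave a function early.
early_exit <- function(v) {
  if (length(v) == 0) {
    return(NA)
  }
  mean(v)    # implicit return of the last expression
}
```

Calling show_progress(x) prints the vector once from inside the function; the returned sum is then auto-printed only if the call itself is made at the top level.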
- Regarding "Misuse of print() and return()": I disagree. print() and return() are not misused.
- Regarding "... the print() and return() functions are used incorrectly.": I disagree. They are used correctly.
- Regarding "For example R does not require the print(x) function to print the values of a vector x.": Correct. If x is on a line by itself, then the interpreter will send it to print() for you.
- Regarding "This is an important distinction between R and other languages.": I disagree. It's a shortcut and not important.
- Regarding "Likewise you do not need return() at the end of a function definition.": Correct. If the last expression is left unassigned, then the interpreter will return it for you. Indeed, R_(programming_language)#Programmer_created_functions explains this in the comments.
- Regarding "Again, this is an important distinction between R and other languages.": I disagree. It's a shortcut and not important.
- Regarding "I would like to edit the examples to reflect this...": I will revert these edits b/c they will confuse a reader not familiar with R's shortcuts. Indeed, when I was new to the language and encountered these shortcuts in code, I was confused. The article's audience is intended to be as broad as possible. However, a new section titled "Shortcuts" would be an appropriate place to enlighten a reader new to the language. Timhowardriley (talk) 22:43, 18 February 2024 (UTC)
- Additional thought: Just because a language allows for a syntactic construct, it doesn't mean it's wise to use it. For example, COBOL used to allow a function to alter a local variable of another function. See Computer_program#Coupling. Software engineering principles emphasize readability over cryptic syntax constructs. For example, this version of the article has a mistake trying to explain the return() shortcut. The return() is not optional b/c the expression is assigned to the variable z. Timhowardriley (talk) 00:29, 19 February 2024 (UTC)
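For reference, a small sketch of how R treats a function whose last expression is an assignment. As far as base R's semantics go, the assigned value is still returned, only invisibly, so a top-level call prints nothing unless return() or a bare z is added. The function names add_assign and add_explicit are hypothetical and are not taken from the article version mentioned above.

```r
# Hypothetical function whose last expression is an assignment.
add_assign <- function(x, y) {
  z <- x + y
}

# Hypothetical function with an explicit return().
add_explicit <- function(x, y) {
  z <- x + y
  return(z)
}

# Both calls hand back the sum to the caller.
implicit_value <- add_assign(1, 2)
explicit_value <- add_explicit(1, 2)

# withVisible() reports whether a top-level call would be auto-printed.
implicit_visible <- withVisible(add_assign(1, 2))$visible
explicit_visible <- withVisible(add_explicit(1, 2))$visible
```

In this sketch both functions return the same value. The difference lies in the visibility flag, which may be what makes the assignment form look as though it returns nothing when called at the prompt.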
[1] Grogan, Michael (2018). Python vs. R for Data Science. O'Reilly Media, Inc.