Talk:CYK algorithm

Computer science C‑class Mid‑importance

This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.Computer scienceWikipedia:WikiProject Computer scienceTemplate:WikiProject Computer scienceComputer science

C

This article has been rated as C-class on Wikipedia's content assessment scale.

Mid

This article has been rated as Mid-importance on the project's importance scale.

Things you can help WikiProject Computer science with:

Here are some tasks awaiting attention:

Article requests :
- Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup :
- Computer science articles needing attention
- Computer science articles needing expert attention
Copyedit :
- Computing
Expand :
- Computer science
Infobox :
- Computer science articles without infoboxes
Maintain :
- Timeline of computing 2020–present
Photo :
- Find pictures for the biographies of computer scientists (see List of computer scientists)
- Computing articles needing images
Stubs :
- Computer science stubs
Unreferenced :
- WikiProject Computer science/Unreferenced BLPs
Project-related :
- Tag all relevant articles in Category:Computer science and sub-categories with {{WikiProject Computer science}}

I don't like the article

I don't like this article because it gives an example about categorizing NP(noun phrase) VP(verb phrase) and so forth, and this would be useful for someone who wanted to build a POS tagger. However, this article is not about POS taggers, it's about CYK, so examples should be given with that in mind. Just give a grammar, and a word, and have various states of the matrix P for the example. Not that complicated. —Preceding unsigned comment added by 188.26.48.167 (talk) 12:56, 23 January 2011 (UTC)[reply]

Explain practical parse tree generation!

Original threadstarter did not title this thread. Also: linebreaks added to original threadstarter's post. --131.234.234.12 (talk) 11:56, 23 January 2009 (UTC)[reply]
anyone please help
is there any proper algorithm to show that how the string is enabled .
Example:
given a string,the algortithm could give how it comes from the start S, show how many ways it can be ,or even give the shortest way.
—Preceding unsigned comment added by 218.7.20.106 (talk • contribs) 02:32, 21 July 2003

It sounds like you're asking about parse trees... the page mentions that the algorithm is easy to modify to find a parse tree (or several). This would show you how to construct the string from the rules. I *think* that with Chomsky normal form, all derivations are the same length, but I'm not really an expert...

—Preceding unsigned comment added by 24.21.186.8 (talk • contribs) 18:40, 9 July 2004

Give examples!

Original threadstarter did not title this thread. --131.234.234.12 (talk) 11:59, 23 January 2009 (UTC)[reply]
This page is great. However - it would incredible with a few examples to lead readers through the process.
—Preceding unsigned comment added by 209.134.42.169 (talk • contribs) 13:01, 10 May 2007

Another article on CYK

There is another, almost empty article on the CYK parser: http://en.wikipedia.org/wiki/CYK_%28algorithm%29 The two should be merged (the former should be deleted.)
—Preceding unsigned comment added by 89.133.45.42 (talk • contribs) 21:33, 21 June 2007

Done. Ceroklis 10:01, 26 September 2007 (UTC)[reply]

What does C stand for in CYK?

So far, I have come across three different names for letter C:

Cook
Cocke
Coke

Which one is correct?
—Preceding unsigned comment added by 89.133.45.42 (talk • contribs) 21:36, 21 June 2007

Cocke, according to the Jurafsky and Martin book. Ealdent 15:26, 23 October 2007 (UTC)[reply]

Mistake in the Algorithm?

I'm almost sure there is a mistake in the algorithm, but I'll write it here first just in case...
The line:
"Let the grammar contain r terminal and nonterminal symbols R1 ... Rr."
should be:
"Let the grammar contain r nonterminal symbols R1 ... Rr."
There is no need to consider the terminals also.
Sararkd (talk) 02:37, 18 January 2008 (UTC)[reply]

Change algorithm indexing

Is there any objection to changing the algorithm indexing from 1-indexed to 0-indexed? This would make converting from pseudocode to real code much easier. —Preceding unsigned comment added by Jhm718 (talk • contribs) 11:40, 24 April 2009 (UTC)[reply]

Why "Carlos"

Is there any good reason for naming the input string Carlos? It seems silly and irrelevant to me. Sigurdmeldgaard (talk) 08:34, 17 May 2009 (UTC)[reply]

Seems to be corrected - thanks Sigurdmeldgaard (talk) 19:08, 27 May 2009 (UTC)[reply]

Citation from 2009?

Is that paper from 2009 really notable enough to warrant a citation in Wikipedia? It is used to make a fairly technical point. Since all other sources (besides Knuth, which is a reference work) are from the 60s, 70s, it makes the impression as if the latest paper would be the only continuation of the study of this algorithm, which is difficult to believe... that is, IMO it inflates the importance of the paper and the reference should be removed.

While I understand your feelings about giving undue weight to that paper, I would suggest to add more references of recent date about the CYK algorithm. For instance, the CYK algorithm, as well as Valiant's improvement have been recently generalized from context-free grammars to the case of Boolean grammars. Removing content from the article would certainly be a step into the wrong direction, given the size of the article. If there are any recent papers that deserve to be included, please be bold and just go ahead. I am confident that the article will further expanded over time, given that the article's subject is covered in many textbooks and courses on automata theory. Hermel (talk) 21:29, 20 July 2010 (UTC)[reply]

In the previous comment, do you mean the Lange & Leiß paper? I just read that paper, without going deep into the math, and found it incredibly informative, since the main aim is pedagogical.

The paragraph in this article summarizing this paper is weak, and would benefit from being written in more specific terms concerning the three operations (BIN, DEL, UNIT) necessary to transform to CNF, the effect of ordering these operations on size explosion (exponential if DEL precedes BIN), and that DEL and UNIT can be internalized into the CYK algorithm at no extra cost, leaving only a linear transformation on grammar size into 2NF form (BIN). This papers omits TERM because it offers no advantage for the CYK algorithm.

Why so concerned about importance? Clarity often comes at remove from original research and should be valued for its own sake. — MaxEnt 11:27, 3 December 2010 (UTC)[reply]

Yes, the previous comment was about the Lange & Leiß paper. Please, be bold and go ahead if you want to improve that section. Hermel (talk) 14:59, 4 December 2010 (UTC)[reply]

403 Access Forbidden

External Link to "CYK parsing demo in JavaScript" (http://homepages.uni-tuebingen.de/student/martin.lazarov/demos/cky.html) is broken.

A short search didn't result in any alternative location.

I haven't removed the link, since maybe it's a temporary problem (don't think so) and maybe someone else might come up with something. --84.109.241.65 (talk) 05:08, 27 December 2010 (UTC)[reply]

Mistake in the Algorithm?

I think it should be

set P[i,1,j] = true

instead of

set P[i,i,j] = true

if the second index denotes the length of a span. —Preceding unsigned comment added by 93.138.47.91 (talk) 00:58, 7 May 2011 (UTC)[reply]