
User talk:IntangibleMan

From Wikipedia, the free encyclopedia

Welcome!

Hello, IntangibleMan, and welcome to Wikipedia! I hope you like the place and decide to stay. Below are some pages you might find helpful. For a user-friendly interactive help forum see the Wikipedia Teahouse.

I hope you enjoy editing here and being a Wikipedian! Please sign your name on talk pages using four tildes (~~~~); this will automatically produce your name and the date. If you need help, please see our help pages, and if you can't find what you are looking for there, please feel free to ask me on my talk page or place {{Help me}} on this page and someone will drop by to help. Again, welcome! Shadestar474 (talk) 07:44, 17 November 2023 (UTC)

Hi there, I just saw your edit to Vector database - I believe this is true, but could you add a citation for that? Especially since the article is currently a stub, quality Reliable Sources would be a great improvement to the article. Thanks! StereoFolic (talk) 14:51, 17 November 2023 (UTC)

Hi StereoFolic! Thanks for your comment. This fact is mentioned in the documentation of the DBs listed in the article (and in the source code of the open source ones) -- see Qdrant, Pinecone, or Elasticsearch mentioning HNSW or ANN algorithms in general. However, I don't know if these rise to the level of a WP:RS since they are primary sources with commercial motivation to market their own product.
It's mentioned in passing in this VentureBeat article, though not the main focus of the article (most third-party articles on the area seem to be marketing/VC focused). Academic sources are mostly focused on ANN algorithms, rather than vector DBs, which are a productization/commercial phenomenon.
I'm a new editor, so happy to defer to your judgement about what's best here and what's considered reliable by the CS/tech editor community. Thanks!
---
P.S. Would also appreciate your thoughts on the article's list of vector DBs. To me, it's a bit overfocused on startup commercial offerings, possibly due to the availability of sources. But for VC-adjacent sources like TechCrunch or VentureBeat, the focus of the article is how much money the company raised and who they raised it from, rather than the importance or popularity of the product. So I am not sure that they demonstrate notability. This is a fairly new area, so maybe readers trying to find a vector DB would be better served by a search engine than WP until the field is better established? IntangibleMan (talk) 18:47, 17 November 2023 (UTC)
I think both the Pinecone and Elasticsearch sources would be good citations for your addition. We can consider these fairly reliable sources to the extent that they discuss the algorithms and database types in general, less so for claims about their own products. Ideally we would have articles and research papers by independent journalists and researchers, but these are often not available in rapidly developing technologies. Given the not-entirely-independent nature of the sources, I would use both as citations to help further establish reliability.
Regarding the article list - it's very arbitrary and incomplete right now. You may be interested in reading the guidelines at Stand-alone lists. Personally, with tech lists I usually follow the rule of thumb that each entry should have at least one reliable source discussing it not-in-passing. Fundraising news articles typically count (unless they're simply press release reprints), since they demonstrate some degree of notability and often include some discussion of the products, but yes, higher-quality sources would go beyond this. Please feel free to add to or revise the list there! StereoFolic (talk) 15:56, 18 November 2023 (UTC)
If you would like to open a discussion at Talk:Vector database about whether the list is worth including, I would be happy to participate. It probably makes the most sense to split the list into a separate article, "List of Vector Databases", since there's plenty to discuss already about vector databases in a standalone article. This is much more the norm throughout Wikipedia. StereoFolic (talk) 15:57, 18 November 2023 (UTC)
great improvements, thanks a bunch! StereoFolic (talk) 20:05, 20 November 2023 (UTC)