Draft talk:Voice-First AI

This draft is within the scope of WikiProject Artificial Intelligence, a collaborative effort to improve the coverage of Artificial intelligence on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

Human–Computer Interaction (inactive)

This draft is within the scope of WikiProject Human–Computer Interaction, a project which is currently considered to be inactive.

Response to Decline and Request for Re-review

Hi, and thank you for the previous review.

I've substantially revised the draft to address the sourcing concerns raised by User:S0091. Specifically, I have:

- Removed references to company blogs and promotional sites
- Replaced those with citations from independent, reliable, and secondary sources (e.g., IEEE Spectrum, Nature npj Digital Medicine, MIT Technology Review, Business Insider, BBC)
- Maintained a neutral tone and clarified distinctions between voice-first AI and general voice interfaces

I believe this updated version better meets the notability and sourcing criteria and would appreciate a fresh review when time permits.

Warm regards, ArturoFalck (talk) 19:19, 21 May 2025 (UTC)

Notes on Possible Merge/Split/Consolidation

This draft might eventually be more appropriate as a merge or split from existing topics such as Chatbot, Conversational user interface, or a future consolidated article on human-computer dialogue. Notes below reflect that possibility and are shared in good faith to support future editorial decisions.

Rationale for a Dedicated Section or Page on Voice-First AI

Voice-first AI refers specifically to systems where spoken input and output are the primary modalities, unlike most chatbots or graphical user interfaces, which rely heavily on text or screen-based interaction.
The concept is increasingly relevant in public space infrastructure (e.g., transit help points, emergency kiosks, smart intercoms), accessibility technology, and ambient computing, all domains where visual or manual interaction is not practical or intended.
Notable distinctions from general chatbot coverage include:
- Hands-free, eyes-free environments
- Hardware integration (microphones, speakers, edge devices)
- Latency and real-time processing constraints
- Multilingual and accent-robust design considerations
Several academic and industry sources treat “voice-first AI” as a distinct category or subfield within conversational AI.

Relevant Pages for Cross-Linking or Integration

Chatbot – consider expanding coverage of voice interfaces and linking to voice-first use cases.
Conversational AI (draft) – voice-first AI is likely a subtopic or implementation approach under this broader term.
Conversational user interface – overlaps but is broader; focuses on interface design, not just voice-first modality.
Speech recognition and Speech synthesis – useful technical subcomponents, but not covering the broader interaction paradigm.

Source Cleanup Note

As part of addressing reviewer concerns about sourcing, I’ve removed previous citations to arXiv preprints and replaced them with peer-reviewed or published secondary sources. This was done in line with WP:ARXIV, which advises caution when citing preprints that are not yet peer-reviewed.

Replacements include:

A Nature-published article from npj Digital Medicine (Patel & Jones, 2021) in place of the arXiv HERMES kiosk study.
A peer-reviewed Springer book (McTear, 2020) on Conversational AI, replacing the earlier arXiv multimodal AI survey.

Thanks again for your guidance; I'm happy to continue improving the draft based on further feedback. ArturoFalck (talk) 15:02, 24 May 2025 (UTC)