This is my final President’s Corner article and, I must admit, I feel a sense of relief that it is coming to an end. Initially, the prospect of writing eleven articles, one each month, was quite daunting. While I have extensive experience writing technical consulting reports, research articles, and case studies for conferences and journals, all of which have a clear focus, I have not written many general articles. I therefore decided to experiment with Artificial Intelligence (AI) tools to see whether they could help me to produce these articles.
ChatGPT burst onto the scene in November 2022, introducing the concept of a Large Language Model (LLM) to the world. Until then, AI chatbots had been quite disappointing, but ChatGPT could answer questions sensibly and generate well-constructed sentences and paragraphs very rapidly. Being freely available and easy to use, it was widely adopted within a short time. LLMs are computational models capable of generating natural language. They are trained using machine-learning techniques on vast quantities of text data sourced from the internet and books. This makes them extremely powerful tools, capable of producing text in multiple languages and even generating code for computer programming. However, they generate this information probabilistically, producing plausible outputs without verifying the facts.
OpenAI, the developer of ChatGPT, released an upgrade called GPT-4 in March 2023. Microsoft has partnered with OpenAI and incorporated GPT-4 into Copilot, a specialized assistant that works with Microsoft products but can also be used for other purposes. Generative AI is now also included in the Bing and Google search engines.
I first played with ChatGPT shortly after it was released, simply generating text and poetry in English and Afrikaans for amusement. I then started experimenting with GPT-4 to assist with writing the President’s Corner articles. It is very easy to generate paragraphs with simple instructions, which can then be modified with further instructions until you get something useful. You can choose between precise, creative, and balanced styles. As a test, I attempted to have GPT-4 write an entire article for me. It produced a comprehensive, well-written article; however, I found it challenging to get it to convey the specific message I wanted to communicate. Additionally, it generated a substantial amount of information that was unfamiliar to me and difficult to verify.
The next step was to utilize the generative AI capability in Google Search. I found this to be extremely useful, as it generates a summary of the information along with links to additional resources, allowing you to verify the facts and identify the sources. The source material can include news articles, research papers, or presentations, provided they are available on the internet. This significantly accelerates the literature research process.
GPT-4 can also summarize articles very neatly and efficiently; however, I found that it did not always extract the most relevant information for my purposes and invariably required some editing. It is important to note that articles uploaded to GPT-4 for summarization may be added to its training data, making them potentially accessible to others. This is acceptable if the article is already in the public domain and available on the internet; if it is not, there is a risk of disseminating confidential information. There are methods to protect data when using the GPT-4 engine, but these protections are not available in the freely accessible version.
I have also found GPT-4 to be very useful for enhancing style and grammar. Typically, I jot down a few sentences quickly without focusing too much on flow or repetition, and then ask GPT-4 to rewrite the paragraph. The results are generally improved, but may still require further manual editing to ensure the correct message is conveyed. There are other tools, such as Wordtune, Paperpal, and Grammarly, that can be used for the same purpose.
The integration of AI into scientific writing has revolutionized the way researchers draft, edit, and finalize their manuscripts. A Nature survey (https://www.nature.com/articles/d41586-023-02988-6) of 1600 researchers from around the world found that AI is being used to process data, write code, and assist with the writing of papers. It is particularly helpful for researchers whose first language is not English but who need to publish their work in English journals. Scientists are using AI to improve style and grammar and to summarize other articles.
However, there is a risk that research integrity will be compromised and that fake papers will be produced. This has significant implications for the peer review process and has been an important topic of discussion for the SAIMM Publications Committee. The Academy of Science of South Africa (ASSAf) has drafted guidelines for the use of AI tools and resources in research communication, taking into consideration the views of several international scientific societies and journal publishers (https://www.assaf.org.za/wp-content/uploads/2024/09/ASSAf-and-SciELO-DRAFT-Guidelines-for-the-Use-of-Artificial-Intelligence-AI-Tools-and-Resources-in-Research-Communication_-4-Sept-2024.pdf).
The guidelines state that ‘Authors are solely responsible for ensuring the authenticity, validity, and integrity of the content in their manuscripts.’ It is essential for authors to prevent misinformation generated by AI tools from being included in papers, because this may affect the quality of future research and global knowledge. Any information generated by AI must be correctly cited, and citations generated by AI must be checked. Where content is generated by AI and the source cannot be determined, the guidelines provide recommendations on how to reference the AI tool and the method of generation. Transparency is important and the use of AI tools should be disclosed; however, it is not necessary to disclose the use of tools to improve grammar and style. The guidelines also provide recommendations for editors and reviewers. In addition to their usual responsibility for validation of scientific content, editors and reviewers must consider the effects of AI-generated content in a publication. AI tools for editing, reviewing, and plagiarism checking must be used in a responsible manner. Reviewers and editors are still required to make the decisions regarding the evaluation of manuscripts.
In closing, AI tools have the potential to significantly enhance the efficiency and quality of scientific writing. However, their use must be guided by ethical considerations to ensure the integrity and reliability of scientific research. By understanding and responsibly applying these tools, researchers can leverage AI to advance their work while upholding the standards of academic writing.
W.C. Joughin
President, SAIMM