The Application Development Experiences of an Enterprise Engineer


Continuing a Conversation on LLMs

Posted by bsstahl on 2023-04-13 and Filed Under: tools 


This post is the continuation of a conversation from Mastodon. The thread begins here.

Update: I recently tried to recreate the conversation from the above link and had to work far harder than I would have liked to do so. As a result, I have added the following GPT-generated summary of the conversation. I have verified this summary and believe it to be an accurate, if oversimplified, representation of the thread.

The thread discusses the value and ethical implications of Large Language Models (LLMs).

  • @arthurdoler@mastodon.sandwich.net criticizes the hype around LLMs, arguing that they are often used unethically, or suffer from the same bias and undersampling problems as previous machine learning models. He also questions the value they bring, suggesting they are merely language toys that can't create anything new but only reflect what already exists.

  • @bsstahl@CognitiveInheritance.com, however, sees potential in LLMs, stating that they can be used to build amazing things when used ethically. He gives an example of how even simple autocomplete tools can help generate new ideas. He also mentions how earlier models like Word2Vec were able to find relationships that humans couldn't. He acknowledges the potential dangers of these tools in the wrong hands, but encourages people not to dismiss them entirely.

  • @jeremybytes@mastodon.sandwich.net brings up concerns about the misuse of LLMs, citing examples of false accusations made by ChatGPT. He points out that people are treating the responses from these models as facts, which they are not designed to provide.

  • @bsstahl@CognitiveInheritance.com agrees that misuse is a problem but insists that these tools have value and should be used for legitimate purposes. He argues that if ethical developers don't use these tools, they will be left to those who misuse them.


I understand and share your concerns about biased training data in language models like GPT. Bias in these models exists and is a real problem, one I've written about in the past. That post lays out my belief that it is our responsibility as technologists to understand and work around these biases. I believe we agree in this area. I also suspect we agree that the loud voices with something to sell should be ignored, regardless of what they are selling, and I hope we agree that their opinions should not bias ours in either direction. Just because they are saying it doesn't make it true or false; their claims deserve no weight whatsoever when judging the truth of any general proposition.

Where we clearly disagree is this: I believe all of these technologies can help create real value for ourselves, our users, and our society.

In some cases, like cryptocurrencies, that value may never be realized. The scale needed to succeed is available only to those who have already proven their desire to fleece the rest of us, there is no reasonable way to tell the scammers from legitimately-minded individuals when new products are released, and there is no mechanism to prevent a takeover of such a system by those with malicious intent. This is unfortunate, but it is the state of our very broken system.

This is not the case with LLMs, and since we both understand that these models are just a very advanced version of autocomplete, we have at least part of the understanding needed to use them effectively. It seems, however, that we disagree on what that fact (that it is an advanced autocomplete) means. It seems to me that LLMs produce derivative works in the same sense (not the same method) that our brains do. We, as humans, do not synthesize ideas from nothing; we build on our combined knowledge and experience, sometimes creating things heretofore unseen in that context, but always creating derivatives of what came before.

Word2Vec typically uses embedding vectors with a few hundred dimensions; the embedding models behind current GPT-era tools produce vectors with 1536 dimensions. I certainly cannot consciously think in that number of dimensions. It is plausible that my subconscious can, but that line of thinking leads to the consideration of the nature of consciousness itself, which is not a topic I am capable of debating, and somewhat ancillary to the point, which is: these tools have value when used properly and we are the ones who can use them in valid and valuable ways.
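
To make the idea of relationships living in a high-dimensional space a little more concrete, here is a small sketch using the gensim library and its downloadable 300-dimensional Google News Word2Vec model. This is my own illustration, not something from the original conversation, and the exact similarity scores will vary:

    # Sketch: recovering a relationship directly from Word2Vec embeddings.
    # Assumes the gensim package is installed; the model download is large.
    import gensim.downloader as api

    # 300-dimensional vectors trained on the Google News corpus
    vectors = api.load("word2vec-google-news-300")

    # The classic analogy: king - man + woman is closest to queen
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    # Expected output is something like: [('queen', 0.71)]

No human sat down and decided that relationship; it falls out of the geometry of the vectors themselves, which is the sense in which these models find relationships we don't consciously see.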

The important thing is to not listen to the loud voices. Don't even listen to me. Look at the tools and decide for yourself where you find value, if any. I suggest starting with something relatively simple and working from there. For example, I used Bing chat during the course of this conversation to help me figure out the right words to use. I typed in a natural-language description of the word I needed, which the LLM translated into a set of possible intents. Bing then used those intents to search the internet and return results. It then used GPT to summarize those results into a short, easy-to-digest answer along with reference links to the source materials. I find this valuable, and I think you would too. Could I have done something similar with a thesaurus? Sure. Would it have taken longer? Probably. Would it have resulted in the same answer? Maybe. It was valuable to me to be able to describe what I needed and then fine-tune the results, sometimes playing off what was returned from the earlier requests. In that way, I would call the tool a force-multiplier.
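
For anyone who wants to see the shape of that "describe, search, summarize" flow, here is a rough sketch in Python. The complete() and search_web() helpers are hypothetical stand-ins for whatever LLM and search APIs you have access to; this is how I understand the pattern, not Bing's actual implementation:

    # Sketch of a retrieve-then-summarize word-finding flow.
    # `complete(prompt)` and `search_web(query)` are hypothetical helpers.
    def find_the_right_word(description: str, search_web, complete) -> str:
        # 1. Turn the natural-language description into a search query (the "intent")
        query = complete(
            f"Turn this request into a short web search query:\n{description}"
        )

        # 2. Search the web for candidate answers
        results = search_web(query)  # e.g., a list of dicts with title, url, snippet

        # 3. Summarize the results into a short answer that cites its sources
        sources = "\n".join(
            f"- {r['title']} ({r['url']}): {r['snippet']}" for r in results
        )
        return complete(
            "Using only these search results, suggest the word that best fits the "
            "request below, and list the supporting links.\n\n"
            f"Request: {description}\n\nResults:\n{sources}"
        )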

Yesterday, I described a fairly complex set of things I care to read about when I read social media posts, then asked the model to evaluate a bunch of posts and tell me whether I might care about each of those posts or not. I threw a bunch of real posts at it, including many where I was trying to trick it (those that came up in typical searches but that I didn't really care about, as well as the converse). It "understood" the context (probably due to the number of dimensions in the model and the relationships therein) and labeled every one correctly. I can now use an automated version of this prompt to filter the vast swaths of social media posts down to those I might care about. I could then also ask the model to give me a summary of those posts, and potentially try to synthesize new information from them. I would not make any decisions based on that summary or synthesis without first verifying the original source materials and reasoning about them myself, and I would never take any action that impacts human beings based on those results. Doing so would be using these tools outside of their sphere of capabilities. I can, however, use that summary to identify places for me to drill in and continue my evaluation and, I believe, can use it in certain circumstances to derive new ideas. This is valuable to me.
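
A minimal sketch of that filtering idea is below. The interest description and the complete() helper are placeholders of my own, not the actual prompt or tooling I used:

    # Sketch: asking an LLM whether a post matches my stated interests.
    # `complete(prompt)` is a hypothetical helper that sends the prompt to
    # whatever LLM endpoint you choose and returns its text response.
    INTERESTS = (
        "I care about .NET development, software architecture, functional "
        "programming, and conference speaking. I do not care about product "
        "announcements, sports scores, or celebrity gossip."
    )

    def is_relevant(post_text: str, complete) -> bool:
        prompt = (
            "Here is a description of what I want to read about:\n"
            f"{INTERESTS}\n\n"
            "Here is a social media post:\n"
            f"{post_text}\n\n"
            "Answer with exactly one word, YES or NO: would I care about this post?"
        )
        return complete(prompt).strip().upper().startswith("YES")

    # relevant_posts = [p for p in posts if is_relevant(p, complete)]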

So then, what should we build to leverage the capabilities of these tools to the benefit of our users, without harming other users or society? It is my opinion that, even if these tools only make it easier for us to let our users interact with our software in more natural ways, that is, in itself, a win. These models represent a higher level of abstraction in our programming; they are a more declarative mechanism for user interaction. With any increase in abstraction there always comes an increase in danger. As technologists, it is our responsibility to understand those dangers to the best of our abilities and work accordingly. I believe we should not dismiss tools just because they can be abused, and there is no doubt that some certainly will abuse them. We need to do what's right, and that may very well involve making sure these tools are used in ways that benefit users rather than harm them.

Let me say it this way: If the only choices people have are to use tools created by those with questionable intent, or to not use these tools at all, many people will choose the easy path, the one that gives them some short-term value regardless of the societal impact. If we can create value for those people without malicious intent, then the users have a choice, and will often choose those things that don't harm society. It is up to us to make sure that choice exists.

I accept that you may disagree. You know that I, and all of our shared circle to the best of my knowledge, find your opinion thoughtful and valuable on many things. That doesn't mean we have to agree on everything. However, I hope that disagreement is based on far more than just the mistrust of screaming hyperbolists and a misunderstanding of what it means to be an "overgrown autocomplete".

To be clear here, it is possible that it is I who is misunderstanding these capabilities. Obviously, I don't believe that to be the case, but it is always a possibility, especially as I am not an expert in the field. Since I find the example you gave about replacing words in a Shakespearean poem to be a very obvious (to me) false analogy, it is clear that at least one of us, and perhaps both of us, misunderstands its capabilities.

I still think it would be worth your time, and a benefit to society, if people who care about the proper use of these tools would consider how they could be used to society's benefit, rather than allowing the only use to be by those who care only about extracting value from users. You have already admitted there are at least "one and a half valid use cases for LLMs". I'm guessing you would accept, then, that there are probably more you haven't seen yet. Knowing that, isn't it our responsibility as technologists to find those uses and work toward creating the better society we seek, rather than just allowing extremists to use these tools to our detriment?


Update: I realize I never addressed the issue of the models being trained on licensed works.

Unless a model builder has permission to train their models using a creator's works, whether through an OSS or Copyleft license, an explicit license agreement, or direct permission, those works should not be used to train models. If it is shown that a model has been trained using such data sets, and there have been indications (unproven as yet to my knowledge) that this may be the case for some models, especially image-generators, then that is a problem with those models that needs to be addressed. It does not invalidate the general use of these models, nor is it an indictment of any person or model except those in violation. Our trademark and copyright systems are another place where we, as a society, have completely fallen down. Hopefully, that collapse will not cause us to forsake the value that these tools can provide.

Tags: coding-practices development enterprise responsibility testing ai algorithms ethics mastodon 

Social Media

Posted by bsstahl on 2022-11-11 and Filed Under: general 


The implosion of Twitter and my subsequent move to the Fediverse has me reviewing all of my social media activity.

A few of the things I've looked at, and continue to investigate, include:

  • How and why I use each platform
  • How my activity has changed over time
  • What previous statements I've made should be corrected, amended, or otherwise revisited

The revisiting of previous statements will likely happen either on the platform where they originated, or via microblog commentary @bsstahl@cognitiveinheritance.com. The rest of the analysis can be found here for everyone's benefit and comment. Of course, all comments, as indicated below, should be directed to my microblog @bsstahl@cognitiveinheritance.com.

My platforms:

  • My Blog:
    • How I use it: Long form posts, usually technical in nature, that describe a concept or methodology.
    • Future Plans: I hope to continue to use this platform for a long time and would like to be more active. However, I have said that many times and have never been able to keep up a good cadence.
  • Microblogging @bsstahl@fosstodon.org but previously on Twitter:
    • How I use it:
      • Real-time communication about events such as tech conferences with other attendees
      • Keeping in touch with friends I met at events, usually without even having to directly interact
      • Asking for input on concepts or ideas on how to do/use a tool or technology
      • Asking for comments on my blog posts or presentations
      • Promoting my own or other speakers' and writers' posts or talks, especially when I attend a talk at a conference
      • Publishing links to the code and slide-decks for my conference talks
      • Publicly whining about problems with commercial products or services
      • Making the occasional bad joke or snarky remark, usually at the expense of some celebrity, athlete or business
      • Posting individual photos of people I know or places I go
    • Future Plans: With the move to the Fediverse, I may try to focus more completely on technology on this platform. Perhaps sports-related stuff should go elsewhere, maybe a photo-blog site like PixelFed
  • Facebook:
    • How I use it:
      • Private to only family members or friends I actually know
      • Posting Photos of family and friends to a limited audience
      • Check-ins to places I'm at for future reference, especially restaurants
      • Posting political commentary and social memes
    • Future Plans: I want a place for this that is not a walled-garden like Facebook. I feel like private communities could be run on Mastodon or other Fediverse servers like PixelFed. There are a few possibilities I'm exploring.
  • Flickr:
    • How I use it:
      • Paid "professional" account where I keep my off-site backup of every digital photo I've ever taken, plus some scanned photos that were "born analog", in full size
      • A public photostream of my favorite photos that are not family or friends
      • A restricted (to family and friends) photostream of photos of family or friends
      • Hosting of photos for my photoblog sites including GiveEmHellDevils.com
    • Future Plans: Most of this will remain, though I may siphon off specific elements to other, more federated communities. For example, the restricted photostream could move to a PixelFed server.
  • LinkedIn:
    • How I use it: A professional network of people I actually know in the technology space. I don't accept requests from people I have never met, including (especially?) recruiters. If I ever need to find a job again, it will be through referrals from people I know.
    • Future Plans: I'd like to do a better job of posting my appropriate content here, perhaps as links from my blog. Of course, that would require more posts on my blog (see above). Other than that, I don't expect any changes here.
  • YouTube:
    • How I use it:
      • In the past, I used it to post videos of family and friends, though now those are usually posted privately via Flickr or Facebook
      • Most of the time, I post videos of my technical presentations, or other presentations to the local user groups
    • Future Plans: Continue to share videos of technical content
  • Instagram
    • How I use it: To publish photos from my GiveEmHellDevils.com photoblog.
    • Future Plans: I would prefer to move this to a Fediverse service like PixelFed that is not a walled-garden. I may start by adding a second stream using the Fediverse, and see what happens. If things go in the right direction, I may be able to eliminate Instagram.
  • GitHub:
    • How I use it:
      • A public repository of my Open-Source (FOSS) projects and code samples.
      • A public repository of those FOSS projects that I contribute to via Pull-Request (PR)
      • The hosting platform for my Blog Site and my GiveEmHellDevils.com photoblog.
    • Future Plans: No changes expected
  • Azure DevOps
    • How I use it:
      • A private repository of my private code projects
      • A private repository of the source material for my presentation slides
      • A private repository of my many random experiments with code
    • Future Plans: No changes expected
  • Azure Websites
    • How I use it:
      • To publish the individual slide-decks for my presentations as listed on my blog site
    • Future Plans: No changes expected
  • TikTok
    • How I use it: I don't
    • Future Plans: None
Tags: social-media mastodon fediverse 

About the Author

Barry S. Stahl (he/him/his) - Barry is a .NET Software Engineer who has been creating business solutions for enterprise customers since the mid 1980s. Barry is also an Election Integrity Activist, baseball and hockey fan, husband of one genius and father of another, and a 40-year resident of Phoenix, Arizona, USA. When Barry is not traveling around the world to speak at Conferences, Code Camps and User Groups or to participate in GiveCamp events, he spends his days as a Solution Architect for Carvana in Tempe, AZ and his nights thinking about the next AZGiveCamp event where software creators come together to build websites and apps for some great non-profit organizations.

For more information about Barry, see his About Me Page.

Barry has started delivering in-person talks again now that numerous mechanisms for protecting our communities from Covid-19 are available. He will, of course, still entertain opportunities to speak online. Please contact him if you would like him to deliver one of his talks at your event, either online or in-person. Refer to his Community Speaker page for available options.
