
Week Notes: Elevation Specified and Mostly Solved; Datasette Spatialite Windows Fix

 It's been a busy and fun week again!

Most of my time was spent finalizing what the elevation profile Datasette enrichment would look like. I finally found the last piece of the puzzle lying around in a pile of Jinja template practice. One of the goals this week will be to release the elevation package.

I made my first pull request to an open source project, Datasette in this case. (I make them at work, but this is my first foray into the wider world.)

ChatGPT provided the final kick to get the elevation maps plotted really easily.

Subsequently, I learned about Matplotlib's pyplot module and how to add labels to charts.

I got a start back towards writing about dadding. I spruced up some writing I had lying around about cleaning the kitchen with the gang. While checking out old blog entries here, I found another older piece that I love: Meerkats and Ravens. Oh, and then this one about what unschooling with toddlers looks like.

Goals

  • Release the elevation package
  • Release a QRZ QSL enrichment for Datasette
  • Write more about dadding and unschooling
  • Release the KML QSO mapping plug-in for Datasette


ChatGPT, Datasette-Extract, and the US Ham Radio General Exam Question Pool

I started a project, ahem, yesterday to 'quickly' see if ChatGPT could read the entire United States General class amateur radio exam question pool into a Datasette instance using the datasette-extract plugin. As of this morning, I haven't been able to coax ChatGPT (via the gpt-4-turbo model) into doing it. I've captured my rather raw notes below. The short version is that I was never able to get the AI to capture more than 19 questions at a time. I'm hopeful that the pool could be moved into a database table using an iterative process, but for now, I've run out of time for this quick project :)

Occasionally, ChatGPT seemed to hallucinate part of its own process into the table.


Notes Follow

I'm going to track how easy it is to get the general exam question pool into a database using the datasette-extract plugin. I started this endeavor at 20:37 UTC.


Get my already existent OpenAI API key ready to go

20:43: Done. As usual with OpenAI, the hardest part was finding login screens and then the API. Finally did a Google search to find the API.


Install the datasette-extract plugin

I've run into an issue here. I think I have too old a version of Datasette, and Windows can't figure out how to uninstall it.

Using cached datasette_extract-0.1a6-py3-none-any.whl (815 kB)

Using cached datasette-1.0a13-py3-none-any.whl (302 kB)

Using cached datasette_secrets-0.1a4-py3-none-any.whl (12 kB)

Installing collected packages: datasette, datasette-secrets, datasette-extract

  Attempting uninstall: datasette

    Found existing installation: datasette 1.0a3

    Uninstalling datasette-1.0a3:

ERROR: Could not install packages due to an OSError: [WinError 32] The process cannot access the file because it is being used by another process: 'c:\\users\\m3n7es\\appdata\\local\\packages\\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\\localcache\\local-packages\\python311\\scripts\\datasette.exe'

Check the permissions.

I'll clone a dev environment for the plugin and then run in venv. Time now 21:00.

Still Installing

21:05 OK! pytest passes!

Adding Table Column Names

This is easy since I've already got a table for the general exam pool. The headings are:

id question class subelement group_index group_number answer answer_a answer_b answer_c answer_d 
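As a reference point, that schema is easy to stand up locally with Python's stdlib sqlite3. A sketch; the database and table names below are my own placeholders, not necessarily the ones used here:

```python
import sqlite3

# Hypothetical names; the columns match the headings above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE general_question_pool (
        id INTEGER PRIMARY KEY,
        question TEXT,
        class TEXT,
        subelement TEXT,
        group_index TEXT,
        group_number TEXT,
        answer TEXT,
        answer_a TEXT,
        answer_b TEXT,
        answer_c TEXT,
        answer_d TEXT
    )
    """
)
```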

21:21 The column names have been defined with hints. 

id primary key

question follows a line starting with G ends with '?'

class Defaults to G for every question

subelement A number following G before a second letter

group_index The letter following subelement's number (G)(\d)(A-Z)(\d\d) Use $3

group_number two digit number following group_index (G)(\d)(A-Z)(\d\d) use $4

answer A single letter between parentheses that indicates the correct answer choice

answer_a next line starting with 'A.'

answer_b next line starting with 'B.'

answer_c next line starting with 'C.'

answer_d next line starting with 'D.'

I added these additional instructions:

The questions and answers are in line sorted by headings that contain class (always G), then subelement (a single digit following G), then group_index (a single letter following the subelement), then group_number (a question number within the group_index), then the single letter correct answer enclosed in parentheses. The next line contains the entire question text for the question field. The next four lines in each question contain the four possible answers. The end of each question is denoted by '~~'.
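Outside the tool, the line format those instructions describe can also be parsed deterministically. A minimal sketch, assuming headings shaped like 'G1A01 (C)' per the hints above; the function and variable names are mine:

```python
import re

# Heading per the hints above: class, subelement, group index,
# group number, then the correct answer letter in parentheses.
HEADING = re.compile(r"^(G)(\d)([A-Z])(\d\d)\s*\(([A-D])\)")

def parse_pool(text):
    rows, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines are not significant
        m = HEADING.match(line)
        if m:
            current = {
                "class": m.group(1),
                "subelement": int(m.group(2)),
                "group_index": m.group(3),
                "group_number": m.group(4),
                "answer": m.group(5),
            }
        elif line.startswith("~~"):  # '~~' ends each question
            if current:
                rows.append(current)
            current = None
        elif current is not None:
            for key, prefix in (("answer_a", "A."), ("answer_b", "B."),
                                ("answer_c", "C."), ("answer_d", "D.")):
                if line.startswith(prefix):
                    current[key] = line[len(prefix):].strip()
                    break
            else:
                # first unmatched line after a heading is the question text
                current.setdefault("question", line)
    return rows
```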

I've copied the entire question pool into the tool. Now, I'll press 'Extract'.



Time is 21:26 UTC

Extracting to Table

Got back this error message:

Error: Error code: 404 - {'error': {'message': 'The model `gpt-4-turbo` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}

Extraction failed


OK. Looking at my OpenAI account, I see:


No gpt-4-turbo. So, that's a bit of a challenge.

OK! The API is like using a Clipper Card on BART. You have to pay up front.


I put some money in the account. 

I'll try to extract again. It's 21:43.

It's Working!!!

[
  {
    "id": 1,
    "question": "On which HF and/or MF amateur bands are there portions where General class licensees cannot transmit?",
    "class": "G",
    "subelement": "G1",
    "group_index": "A",
    "group_number": "01",
    "answer": "C",
    "answer_a": "60 meters, 30 meters, 17 meters, and 12 meters",
    "answer_b": "160 meters, 60 meters, 15 meters, and 12 meters",
    "answer_c": "80 meters, 40 meters, 20 meters, and 15 meters",
    "answer_d": "80 meters, 20 meters, 15 meters, and 10 meters"
  },
  {
    "id": 2,
    "question": "On which of the following bands is phone operation prohibited?",

The engine is still cranking along at 21:47.

And Then </exceeds>

  {
    "id": 19,
    "question": "When is it permissible to communicate with amateur stations in countries outside the areas administered by the Federal Communications Commission?",
    "class": "G",
    "subelement": "G1",
    "group_index": "B",
    "group_number": "08",
    "answer": "B",
    "answer_a": "Only when the foreign country has a formal third-party agreement filed with the FCC",
    "answer_b": "When the contact is with amateurs in any country except those whose administrations have notified the ITU that they object to such communications",
    "answer_c": "Only when the contact is with amateurs licensed by a country whic...  Click to expand ... <exceeds maximum number of characters> ,,groupId,,quizzes,,element,,data,,result,,direct,,[]}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]}</exceeds>}]}]},"
  }
]

Did I hit the end of my billing envelope?

21:51 No, billing seems fine. I wonder if I need to add the file in as a PDF because of this message:

exceeds maximum number of characters

Trying again with a pdf file

21:59 Dropping in a PDF file resulted in a 'Processing...' message for the last 8 minutes. Trying this a subelement (subelement_group? since it didn't complete a subelement) at a time.

Full Subelement at a time

Back up and running at 22:01.
 
Well, shucks, that time it only pulled out two questions. Also, it didn't create the table even though it said it did:







Error 404

Table not found: ham_exam_general_question_pool


I'll try a db that doesn't revolve around a memory table next.

No Memory Table DBs

What could have been really bothersome was a breeze. The table columns auto-populated for me!

'Additional instructions' was not auto-populated, so WooooHooooo!!! for blogging. Meaning, I'm really happy I documented my instructions a few paragraphs back.

22:11 Pushed the 'Extract' button. Results started coming in a few seconds later.

Nuts! It got three questions out this time, but that's it! What's the difference in setups???

Adding Remaining SubElement Group by Hand 


Starting at 22:22

22:24 That worked. The entire G1A subelement group is in the table.

Can it do two subelement groups?

22:26 Input subelement groups B and C

22:27 Both subelement groups have been successfully added.

The rest of the groups in the subelement?

Again, that's two subelement groups, D and E, but it only pulled out one question: the last one in the C group that I accidentally copied back in. Nuts!

Removed the row, removed the input, trying again at 22:32

Made it through the D subelement group and then stopped on 

"G1E – Control categories; repeater regulations; third-party rules; ITU regions; automatically controlled digital station"

I think I see the game. I'll take out the group descriptions and add all the text in to see if I can be done with this. 23:34

Descriptions Removed

23:42 back up and running with all the descriptions removed. We'll see how this goes.

It's taking about four seconds per exam question to figure out the correct extraction.

After 
 "id": "G1E12",
decided it was done

Remember how the ids started out as numbers? Weird.
Note: Updating the following morning. Not weird. I forgot to set the field type to integer.

More Instructions

22:49
Added these additional instructions:

"When the subelement changes, or the subelement group changeds, keep going please. The end of the question pool is deonted by '~~~end of question pool text~~~' You're doing a great job, but please get every additional question this time."

and trying again.

22:49 Three questions have come back. It seems to be thinking now?

22:50 (Yes, I know it's not actually thinking.)

22:51 Calling this. Still at three additional questions.

Don't give away the ending

I took away the instruction about how to find the end of the pool, as well as the line about 'every additional question'.

22:54 Successfully crossed from G2A12 to G2B01

22:54 And now from G2B11 to G2C01

22:55 Stopped at G2C08. Why???

Did ChatGPT read the question? 'What prosign is sent to indicate the end of a formal message when using CW?'

22:59 Made the hop to G3A01 and then promptly decided it was done again.

There were two blank lines above that question rather than one. Is that why?

23:02 started it back up.

23:02 Stopped again at G3A14.

Again, there are three blank lines after this question rather than one.

23:05 Added 'The number of blank lines between questions is NOT significant.' to the Additional instructions.

Stopped two questions later at G3B02.

23:06 Starting again.

Two questions again. Taking away the last instruction.


23:38 So Tired
Got this error a few rows in:


After changing 'Additional instructions' to 

"IGNORE ALL BLANK LINES in content. Extract all data from content according to the following instructions. Rows will always begin with the pattern (G)(\d)([A-J])(\d\d)(\s*)([A-D]) and end with a line containing '~~' The questions and answers are in line sorted by headings that contain class (always G), then subelement (a single digit following G), then group_index (a single letter following the subelement), then group_number (a question number within the group_index), then the single letter correct answer enclosed in parentheses. The next line contains the entire question text for the question field. The next four lines in each question contain the four possible answers. The end of each question is denoted by '~~'"

Let's flush the state and start over

Looking above, the plug-in did as well as it ever did before I tried all the above experiments. One thing I hadn't realized (although I'd documented it) was that I accidentally changed the key to be text on my second try. I'm moving back to the original material copied in and the original instructions with a numeric key.

First, I tried without a new key and wound up only getting two questions back. Just as bad as ever. 
Changing all the fields with numbers to integer resulted in one question.

I'm going to create a new OpenAI key and start on a clean database.

New database, new key, new table name wound up with 13 questions on the first try. I don't think I'

Ham Radio QSO Elevation Profiles Up and Running or ChatGPT to the Rescue

 The initial release of the ham radio QSO elevation profiles Datasette enrichment is up and running! 

(In the next few days, I expect there to be several posts on this.)

First, let's talk about the beauty of ChatGPT for personal coding challenges, my personal coding challenges. Over the course of the last week, I've tugged back and forth at the problem of moving data from the Google Maps Elevation API into a PNG chart that I thought was going to be provided by the Google Chart API. All of this work taught me a lot, but did not produce useful results.

Yesterday though, I had ChatGPT open to see what sort of job it could do creating electronic circuit schematics (spoiler: the results aren't pragmatically useful yet, although they are pretty). Since I had the app open anyway, I popped over to my ongoing conversation with ChatGPT about my fictional friend Penelope. (You can see her prompt below.) Here's what I jotted down to see if Penelope could solve my charting issues:

How would penelope make a line chart of two dimensional data (distance on the x axis , elevation on the y axis) in python, then create a png file of the chart, then dump out the PNG encoding of the chart so that it could be included in a data:image/png;base64 html tag

ChatGPT came back with an answer a few seconds later that was unlike anything I'd tried so far.

import matplotlib.pyplot as plt

# Sample data (distance and elevation)
distance = [0, 1, 2, 3, 4, 5]
elevation = [100, 120, 110, 130, 125, 140]

# Create line chart
plt.plot(distance, elevation)
plt.xlabel('Distance')
plt.ylabel('Elevation')
plt.title('Elevation Profile')
plt.grid(True)

# Save the chart as a PNG file
plt.savefig('elevation_chart.png')

# Show the chart (optional)
# plt.show()

That would get me a chart. But how to get the chart back as png encoded text? ChatGPT wasn't done yet...

import base64

# Read the PNG file as binary
with open('elevation_chart.png', 'rb') as file:
    png_data = file.read()

# Encode the PNG image as a base64 string
encoded_png = base64.b64encode(png_data).decode('utf-8')

and that was that!
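For what it's worth, the two snippets can be combined without a temp file by rendering into an in-memory buffer. A sketch (the function name is mine) that returns the data:image/png;base64 URI the original question asked about:

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

def elevation_chart_data_uri(distance, elevation):
    """Render an elevation profile and return it as a data: URI
    ready for an <img src="..."> attribute."""
    fig, ax = plt.subplots()
    ax.plot(distance, elevation)
    ax.set_xlabel('Distance')
    ax.set_ylabel('Elevation')
    buf = io.BytesIO()
    fig.savefig(buf, format='png')  # no temp file on disk
    plt.close(fig)
    encoded = base64.b64encode(buf.getvalue()).decode('utf-8')
    return f"data:image/png;base64,{encoded}"
```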

It took some tinkering, but I wound up with an enrichment that returned text-based PNG images per row in our Datasette QSO log, like the following:


Corresponding to transmitting from Lands End just north of the Legion of Honor towards Idaho.


There are definitely scaling issues that could be worked out, but I think the most useful thing would be to simply print a caption on the figure that indicates the angle of the slope over the first two wavelengths (so in this case, the first 40 meters).
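That caption wouldn't be hard to compute. A hedged sketch, with a function name of my own, of the average slope angle over the first two wavelengths, here 40 meters:

```python
import math

def takeoff_slope_degrees(distance_m, elevation_m, span_m=40.0):
    """Average slope angle (degrees) over the first span_m meters.

    Positive means the terrain rises away from the transmit site."""
    rise = run = 0.0
    for d, e in zip(distance_m, elevation_m):
        if 0 < d <= span_m:
            rise, run = e - elevation_m[0], d
    if run == 0.0:
        raise ValueError("no profile points within the span")
    return math.degrees(math.atan2(rise, run))
```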

To wrap up the post, I was curious about the size of base64-encoded PNG text vs. the actual charts. It turns out that the chart shown above, when stored as an image, consumes 23 KB of disk space. The encoded text, when stored in a text file, consumes 30 KB of disk space, so not a whole lot of overhead. Awesome!
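That lines up with base64's expected overhead: every 3 bytes of binary become 4 characters of text, roughly 33 percent. Checking the numbers:

```python
import base64

# base64 turns every 3 bytes of binary into 4 characters of text,
# so a ~23 KB PNG should encode to roughly 23 * 4/3, about 30.7 KB.
binary_kb = 23
expected_text_kb = binary_kb * 4 / 3

# The ratio holds exactly for any payload that is a multiple of 3 bytes:
payload = b"\x00" * 3000
assert len(base64.b64encode(payload)) == 4000
```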

Penelope's Prompt

Let's talk about my friend, no, my hero, whose name is Penelope. Penelope learned to program when she was 12. Back then, there were lots of data feeds served over telnet. As she grew in her chosen profession, Penelope became an expert at Python.




The Capabilities of High Altitude Balloons

12 February 2023 at 21:42

In this MiscDotGeek article, we’re trying something… different. This is an experiment. You know my writing, you know my style. Read the article. Poke holes in it. Why? See at the end (don’t cheat!)

High Altitude Balloons: Unlocking the Potential of Near Space Exploration

Have you ever wondered what the view from the edge of …


The post The Capabilities of High Altitude Balloons appeared first on MiscDotGeek.
